Speech perception difficulties experienced by listeners with hearing impairment
have  been  ascribed  mainly  to  threshold  attenuation  or  a combination  of  threshold 
attenuation  and  distortion  that  is  caused  by  the  hearing  impairment  and/or  environmental 
factors.  One  method  to  investigate  these  difficulties  is  to  simulate  hearing  loss  in  listeners 
with  normal  hearing  through  the  use  of  filtering,  attenuation,  temporal  shifts,  and/or 
competing  noise. 

Most  simulation  studies  have  concluded  that  threshold  attenuation  has  the  primary 
effect  on  speech  perception.  A few  others  report  that  other  distortions  of  the  audible 
signal are the dominant factor. Few data are available for conditions in which threshold
attenuation and other distortions are combined. Also, data are lacking for recognition of connected
speech  passages  under  any  listening  condition.  This  study  investigated  the  effects  that 
combined  acoustic  distortions  have  on  the  recognition  of  various  speech  materials  in  quiet 
and  in  noise  by  listeners  with  normal  hearing,  using  the  HELOS,  a hearing  loss  simulator. 

Sixty  subjects  with  normal  hearing,  age  18-30  years,  were  assigned  to  six 
experimental  HELOS  settings.  Each  subject  listened  to  monosyllabic  words,  nonsense 
(CVC)  syllables,  and  connected  speech  passages  under  four  listening  conditions:  with  and 
without  the  HELOS  and  in  quiet  and  noise.  Stimuli  were  presented  at  56  dB  HL  and  in 
signal-to-noise  ratios  of  +8  dB  for  monosyllabic  words  and  nonsense  syllables  and  +2  dB 
for  connected  speech  passages. 

Analysis of variance revealed that recognition of
monosyllabic  words  and  nonsense  syllables  presented  in  quiet  was  affected  by  threshold 
attenuation.  In  noise,  differences  in  word  and  syllable  recognition  were  found  under 
combined  threshold  and  distortion  conditions,  but  not  for  either  variable  alone.  For 
recognition  of  speech  passages,  threshold  was  a main  effect  with  noise  having  an 
interaction  across  HELOS  conditions.  All  speech  stimuli  were  correlated  within  HELOS 
settings  for  quiet  and  noise  conditions,  suggesting  that  monosyllabic  word  recognition 
scores may predict recognition of connected speech, particularly in noise.
CHAPTER 1
INTRODUCTION 


"Quit  mumbling.  I can't  understand  what  you're  saying" 

"I  can't  hear  you.  What  did  you  say?" 

"I  can  understand  people  just  fine  when  it's  quiet. 

But  if  there  is  any  noise  or  other  people  talking,  forget  it!" 

"People  just  talk  too  fast  for  me  to  understand  them." 

Every  day  these  comments  are  heard  by  hearing  professionals.  They  are  made  by 
individuals  who  are  experiencing  the  consequences  of  hearing  impairment.  Also, 
variations  of  these  comments  are  made  by  the  individual's  family  members,  friends,  and 
others  who  orally  communicate  with  them.  Audible  speech  is  a constant,  seemingly 
automatic  and  vital  part  of  most  human  interactions,  whether  these  interactions  are  direct 
face-to-face  with  the  communication  partner  or  indirect  interactions  through  audio 
technology.  Therefore,  the  inability  to  hear  speech  loudly  and  clearly  impacts  every 
aspect  of  daily  life:  developmental,  educational,  social,  vocational,  and  psychological 
(Ling,  1989). 

The  major  physical  parameters  of  speech  sounds  are  intensity,  frequency  and 
spectrum  with  the  underlying  parameter  of  time  (Martin,  1994).  The  corresponding 
psychological  or  perceptual  parameters  are  loudness,  pitch  and  quality.  Individual  speech 
sounds  and  sound  combinations  are  recognized  and  interpreted  by  the  human  auditory 
system  by  the  relationship  of  these  parameters  occurring  during  a particular  time  period. 


When  there  is  interference  that  changes  these  parameter  relationships,  the  original  speech 
sound  becomes  distorted  (Boothroyd,  1993).  This  interference  and  subsequent  distortion 
can  result  from  an  external  event,  such  as  environmental  noise,  from  an  internal  condition, 
such  as  an  impaired  auditory  perceptual  system  or  from  a combination  of  both.  This  study 
explores  and  discusses  the  perception  of  speech  from  the  perspective  of  the  interference 
created  by  an  impaired  auditory  system  listening  with  and  without  interference  of 
environmental  noise. 

In  studying  the  perceptual  abilities  of  the  impaired  auditory  system,  one  can  be 
overwhelmed  by  the  variability  of  speech  perception  by  individuals  with  hearing 
impairment.  This  variability  is  a consequence  of  complex  and  often  integrated  factors. 
First  and  foremost,  the  complexity  of  the  auditory  system  involves  a range  of  energy 
transference from "simple" mechanical vibration to chemical interactions, to
neuro-electrical transmissions and, finally, to multi-level integration of the present and formerly
stored  electrical  events  (Martin,  1994).  The  complexity  of  the  system  is  increased  further 
by  the  number  of  available  sensory  units  and  their  functional  redundancy.  Second,  the 
type  of  hearing  impairment  and  its  etiology  may  affect  only  a part  of  the  auditory  system 
that,  in  turn,  affects  one  parameter  of  speech  perception  more  than  others  (Martin,  1994). 
For  example,  a conductive  hearing  impairment  resulting  from  physical  obstruction  and 
disease  to  the  outer  and/or  middle  ear,  such  as  with  otitis  media,  will  have  the  greatest 
impact  on  intensity  and  loudness  perception.  Sensorineural  hearing  loss  resulting  from 
damaged  sensory  hair  cells  and  auditory  nerves,  such  as  with  meningitis,  will  have  the 
greatest impact on spectrum and the perception of pitch-loudness-time relationships of
sounds. 

Third,  the  severity  and  location  of  the  damage  within  the  auditory  system  will 
affect  the  range  of  frequencies  perceived  at  any  one  intensity  level  (Clark  & Bohne,  1986). 
Certainly,  the  impact  of  a mild  30  dB  HL  loss  across  all  frequencies  will  be  different  for 
perceiving  the  speech  spectrum  as  compared  to  a hearing  loss  with  normal  sensitivity 
through  1000  Hz  falling  to  a severe  80  dB  HL  loss  at  2000  Hz  and  above. 
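The contrast between these two configurations can be sketched numerically. The following is a minimal illustration, not a clinical calculation: it assumes pure tones presented at 56 dB HL (the presentation level reported in the abstract, borrowed here for convenience), takes "normal sensitivity" to mean 0 dB HL, and uses an arbitrary set of audiometric frequencies. The function name `sensation_levels` is hypothetical.

```python
# Illustrative audiograms (thresholds in dB HL). The frequency set is an
# assumption chosen for this sketch.
FREQUENCIES_HZ = [250, 500, 1000, 2000, 4000]

# Mild, flat 30 dB HL loss across all frequencies.
flat_mild_loss = {f: 30 for f in FREQUENCIES_HZ}

# Normal sensitivity (assumed 0 dB HL) through 1000 Hz, 80 dB HL above.
sloping_loss = {f: (0 if f <= 1000 else 80) for f in FREQUENCIES_HZ}


def sensation_levels(presentation_db_hl, thresholds_db_hl):
    """Return dB above threshold at each frequency; values <= 0 are inaudible."""
    return {f: presentation_db_hl - t for f, t in thresholds_db_hl.items()}


flat_sl = sensation_levels(56, flat_mild_loss)
sloping_sl = sensation_levels(56, sloping_loss)

# Frequencies at which the sloping loss leaves the signal below threshold.
inaudible = [f for f, sl in sloping_sl.items() if sl <= 0]

print(flat_sl)
print(sloping_sl)
print(inaudible)
```

Under these assumptions the flat loss leaves every frequency audible at a reduced level, whereas the sloping loss removes the region at and above 2000 Hz entirely while leaving the lower frequencies untouched.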

Furthermore,  individuals  who  share  commonalities  among  the  above  variables  still 
demonstrate  a wide  range  of  speech  perception  abilities  (Walden,  1984).  The 
interpretation  of  the  sensory  information  sent  by  the  damaged  periphery  is  affected  further 
by  the  physical  integrity  and  "experience"  or  functional  history  of  the  central  auditory 
system  and  associated  language  centers.  This  interpretation  is  influenced  by  the 
developmental  age  of  onset  of  the  hearing  impairment,  age  of  its  identification  and  onset 
of  any  therapeutic  treatment.  Treatment  variables  such  as  type,  duration,  consistency  and 
results  influence  the  individual's  abilities  to  perceive  and  understand  speech.  The 
underlying  language  skills  and  educational  knowledge  level,  along  with  the  psychological 
variables  of  personality,  social-emotional  support  and  development,  all  appear  to  influence 
the  ability  of  an  individual  with  a hearing  impairment  to  cope  with  a distorted  speech 
signal  (Garstecki,  1993). 

Although each of these variables has a potential impact, the foundation
of speech perception abilities is the integrity of the auditory perceptual system, i.e., the
hearing impairment. Boothroyd (1978) stated that “persons with sensorineural hearing
loss use the same acoustic cues for speech perception as do persons with normal hearing.
If  they  are  denied  any  of  these  cues,  there  are  no  acoustic  substitutes”  (p.  133).  Therefore, 
if  the  parameters  that  are  critical  for  accurately  perceiving  speech  sounds  could  be 
identified  and  then  compared  to  their  presence,  absence  or  distortion  in  the  impaired  ear, 
the  resulting  information  could  be  applied  to  beneficial  treatment  of  hearing  impairment. 

A variety  of  methods  have  been  used  to  study  how  perception  of  speech  is  affected 
by  changing  or  distorting  the  acoustic  parameters  of  a speech  signal,  e.g.,  use  of  frequency 
filtering,  amplitude  attenuation,  time  compression  and  expansion,  and  various  types  of 
noise  to  compete  with  the  signal.  Those  studies  that  have  related  their  signal 
manipulations  to  what  may  be  occurring  in  the  impaired  auditory  system  have  sometimes 
used  a method  of  simulating  hearing  loss  in  listeners  with  normal  hearing.  It  is  theorized 
that  the  characteristics  of  hearing  loss  have  been  simulated  if  the  speech  perception 
performances  of  listeners  with  normal  hearing  match  the  performances  of  listeners  with 
known hearing losses. If so, the manipulations applied to the signal are presumed to be similar
to the changes imposed by the impaired auditory system. Many studies have simulated
hearing  loss  with  the  use  of  masking  noises  and  bandwidth  filtering  to  attenuate  and  distort 
the speech signal (Bilger & Wang, 1976; Fabry & Van Tasell, 1986; Humes et al., 1987a;
Humes et al., 1987b; Humes et al., 1988; Miller & Nicely, 1955; Needleman & Crandell,
1995; Owens, Benedict & Schubert, 1972; Sher & Owens, 1974; Wang et al., 1978; Zurek
& Delhorne, 1987). A few investigations have studied the effect of temporal distortions
on the perception of speech signals by normal hearing listeners (Florentine & Buus, 1984;
Humes & Roberts, 1990).
The majority of these studies have concluded that the primary effect of
sensorineural  hearing  impairment  is  the  attenuation  of  the  speech  signal  through  loss  of 
threshold  sensitivity.  Considering  this  conclusion,  it  would  appear  that  speech  perception 
difficulties  could  be  solved  for  listeners  with  hearing  impairment  by  increasing  the  intensity 
of the speech signal above their hearing thresholds. However, modern hearing aids, which
amplify sound, do not restore normal speech perception abilities in the majority of listeners
with hearing impairment, especially in noisy environments. Considering the complexity of
the auditory system and the rapid occurrence of acoustic events, perhaps the primary
effect of sensorineural hearing impairment is distortion of the spectrum, the
inter-relationships of the basic parameters of sound.

The  primary  purpose  of  this  study  was  to  investigate  the  effects  that  combined 
parameter  distortions  have  on  the  perception  of  three  types  of  speech  stimuli  by  listeners 
with  normal  hearing.  The  combined  effects  of  intensity  attenuation,  frequency  filtering, 
nonlinear  input-output  function,  noise  at  high  intensities  and  random  time  delays  would 
cause  distortion  of  an  audible  signal.  According  to  Gagne  and  Erber  (1987),  these  factors 
relate  to  the  characteristics  of  loss  of  hearing  sensitivity  and  frequency  selectivity, 
recruitment, and reduced dynamic range and temporal resolution often found in cochlear
hearing impairments. The individual contribution of each of these changes to the overall
distortion is unknown for a dynamic signal such as conversational speech. Speech units,
or syllables, are identified accurately not only by their inherent acoustic
characteristics but also by specific transitional cues that occur when one phoneme or
word flows into the next. These transitional cues are critical for a
phoneme’s accurate perception by a listener, regardless of whether that particular
phoneme  was  actually  audible  (Boothroyd,  1993;  Liberman  et  al.,  1967).  When  these 
inter-relationships  are  distorted  beyond  an  acceptable  range  for  a listener,  the  phoneme  or 
word  is  misidentified.  In  addition,  the  presence  of  environmental  noise  often  appears  to 
multiply  the  effects  of  these  acoustic  distortions  on  a speech  signal.  Therefore,  part  of  this 
study  will  include  perception  of  the  distorted  speech  stimuli  while  listening  in  background 
noise. 

The  secondary  purpose  of  this  study  is  related  to  one  of  the  future  outcomes  of 
this  research.  Hearing  loss  simulation  has  been  a useful  technique  in  the  audiologic 
rehabilitation  process.  It  can  be  useful  in  empathy  and  communication  strategies  training 
for  professionals  and  for  family  and  friends  of  people  with  hearing  impairments.  One  goal 
of  any  clinical  treatment  is  to  be  accurate,  reliable,  cost  effective  and  convenient.  A small 
hearing  loss  simulation  device,  called  the  HELOS,  is  commercially  available  to  simulate  a 
variety  of  hearing  impairments  in  normal  hearing  listeners.  The  designer  of  the  HELOS 
simulator  claims  that  the  device  can  change  the  speech  signal  to  simultaneously  simulate 
the  loss  of  sensitivity,  frequency  selectivity,  normal  loudness  function  and  dynamic  range, 
and  temporal  resolution  (Erber,  1988;  Gagne  & Erber,  1987).  Therefore,  the  HELOS  was 
used  to  manipulate  the  speech  stimuli  during  this  experiment  to  begin  building  a data  base 
for  future  use.  Analysis  of  its  electroacoustic  output  was  related  to  the  perception  of 
nonsense  syllables,  monosyllables  and  passages  in  quiet  and  background  noise. 


CHAPTER  2 

REVIEW OF THE LITERATURE


The  major  sections  of  this  chapter  will  review  the  effects  of  sensorineural  hearing 
loss  on  speech  perception  in  terms  of  sensitivity  and  loudness  perception,  frequency 
selectivity,  and  temporal  resolution.  The  focus  of  the  discussion  will  be  previous 
investigations  that  have  used  the  technique  of  hearing  loss  simulation.  First,  however,  a 
brief  overview  of  neurophysiologic  and  psychoacoustic  correlates  of  cochlear  hearing 
impairment  will  lay  the  foundation  for  the  utility  of  hearing  loss  simulation  in  speech 
perception  research. 

Neurophysiologic and Psychoacoustic Correlates
of  Sensorineural  Hearing  Impairment 

Although  the  outer  and  middle  ear  systems  influence  the  fidelity  of  the  auditory 
signal  conducted  to  the  inner  ear,  the  mechanical  sensitivity  of  the  cochlea's  basilar 
membrane  has  the  primary  impact  on  the  perception  of  frequency  and  intensity  (Selleck  et 
al.,  1982).  In  the  normal  hearing  cochlea,  it  is  the  movement  of  the  basilar  membrane 
disturbing  cochlear  fluids  and  bending  hair  cell  stereocilia  that  determines  the  sharp 
mechanical  tuning  and,  thus,  sensitivity  to  sounds.  This  movement  is  significantly 
influenced by the motility of the outer hair cells (OHCs), which are believed to play a major
role as a “motor system” increasing or decreasing the action of the basilar membrane
(Brownell, 1990; Moore, 1996). This mechanical action causes the stereocilia movement
and depolarization of inner hair cells (IHCs). The chemical changes within the IHCs, in
turn, cause neuro-transmitters to be released resulting in the firing/discharging of auditory
nerve  fibers  (Hudspeth,  1985).  Robertson  (1984)  found  that  all  auditory  nerve  fibers  that 
discharge  with  acoustic  stimulation  contact  IHCs  and  mirror  the  output  from  a discrete 
area  of  the  Organ  of  Corti.  It  is  important  to  emphasize  that  although  the  IHCs  are 
responsible  for  transmitting  sensory  information  to  the  afferent  nerve  fibers,  the  integrity 
of  the  OHCs  appears  to  determine  the  presence  and  effectiveness  of  the  active  mechanism 
involved  in  basilar  membrane  movement.  Thus,  damage  to  either  type  of  hair  cell  would 
reduce  the  sensitivity  to  and/or  distort  the  perception  of  sounds. 

Neurophysiologic  studies  of  single  auditory  nerve  fibers  have  allowed  investigation 
of  the  function  of  the  Organ  of  Corti  elements  and  the  effects  of  cochlear  damage. 
Although the normal auditory fiber unit may fire over a range of frequencies, it is most
sensitive at a characteristic frequency (CF), where the lowest sound intensity is
required for the unit to respond (Henderson et al., 1994). The shape of a fiber's response
across  frequencies  is  a tuning  curve,  which  varies  according  to  its  CF.  Therefore,  a normal 
auditory  nerve  fiber  functions  like  a narrowly  tuned  bandpass  filter.  It  passes  or  transmits 
those  signals  near  its  CF  and  rejects  any  extraneous  signals  or  noise  that  have  frequencies 
outside  the  curve  or  skirts  of  the  filter.  This  function  would  suggest  that  an  increase  in  the 
bandwidth  of  these  filters  could  result  in  a reduced  capability  to  detect  a signal  embedded 
in  noise,  or  reduced  frequency  selectivity  (Henderson  et  al.,  1994;  Tyler  et  al.,  1982).  The 
reduced tuning would allow the filter to pass more noise, thus decreasing the signal-to-noise
(S/N) ratio.
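The relation between filter bandwidth and S/N ratio can be sketched with a simple calculation. This is an idealized illustration only: it assumes a rectangular filter, a tonal signal that falls inside the passband, and white noise of constant spectral density, so that the noise power passed by the filter grows by 10 log10 of the bandwidth. The function name `snr_db` and all numeric values are hypothetical.

```python
import math


def snr_db(signal_db, noise_density_db_per_hz, bandwidth_hz):
    """S/N ratio (dB) at the output of an idealized rectangular filter.

    Assumes white noise, so the noise power passed by the filter equals the
    spectral density (dB per Hz) plus 10*log10(bandwidth in Hz).
    """
    noise_power_db = noise_density_db_per_hz + 10 * math.log10(bandwidth_hz)
    return signal_db - noise_power_db


# Hypothetical values: a 60 dB signal in noise with a 20 dB/Hz density.
normal_filter = snr_db(60.0, 20.0, 100.0)     # narrowly tuned filter
broadened_filter = snr_db(60.0, 20.0, 400.0)  # filter four times as wide

print(round(normal_filter, 2))
print(round(broadened_filter, 2))
print(round(normal_filter - broadened_filter, 2))
```

In this sketch each doubling of the bandwidth admits about 3 dB more noise, so a filter four times as wide loses roughly 6 dB of S/N ratio even though the signal level is unchanged.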
These effects of reduced tuning have been seen in animal studies where cochlear
damage  has  been  induced  with  noise  or  ototoxic  drugs.  Behavioral  and  neural  measures 
have  shown  changes  in  threshold  sensitivity  and  frequency  tuning  of  auditory  fibers 
(Pickles,  1988).  The  shape  of  the  tuning  curves  typically  broadens  and  becomes  more 
shallow,  with  the  tip  widening  and  occurring  at  higher  thresholds  compared  to  the  normal 
fiber.  Histological  studies  of  these  animals  have  shown  reduction  and  total  loss  of  hair  cell 
stereocilia  and  complete  hair  cells  in  relatively  discrete  areas  in  the  cochlea  which 
correspond  to  a signal's  frequency. 

Psychophysical  tuning  curves  (PTC)  and  auditory  filter  patterns  of  listeners  with 
sensorineural  hearing  loss  have  shown  similar  characteristics  (Pickles,  1988).  Depending 
on  the  degree  and  configuration  of  the  hearing  loss,  the  PTC  may  become  so  broadened 
and  distorted,  it  looks  more  like  a flat  line  (Preminger  & Wiley,  1985),  while  the 
bandwidth  of  the  auditory  filter  pattern  often  increases  with  the  increase  in  hearing  loss 
(Pickles,  1988).  Preminger  and  Wiley  compared  the  PTCs  of  pairs  of  listeners  matched 
according to audiometric thresholds but who differed (p < .05) in word intelligibility scores.
The PTC of the listener with better speech perception was consistently closer to normal
in appearance than that of the listener with poorer speech perception. These results support a
connection between the loss of normal frequency sensitivity and selectivity and difficulty
understanding speech.

Intensity coding and the perception of loudness growth in the cochlea appear to be
complex responses that involve, at the least, basilar membrane mechanical vibration and
changes in hair cell intracellular voltage and polarization. The result is release of a
chemical  transmitter  that  leads  to  the  discharge  and  saturation  of  auditory  nerve  fibers 
(Pickles,  1988).  Evans  (1975)  and  Viemeister  (1983)  observed  that  as  the  intensity  of  a 
given  frequency  increased,  a specific  population  of  nerve  fibers  with  low  thresholds 
increased  their  firing  rate.  As  the  signal's  intensity  increased,  fibers  with  moderate-to-high 
thresholds  and  a CF  identical  to  the  signal  began  to  discharge.  In  addition,  nerve  fibers 
with  adjacent  CFs  began  to  fire  or  were  “recruited”  as  part  of  the  total  neural  discharge. 
Neural  studies  of  chinchillas  with  noise  induced  hearing  loss  have  shown  that  a higher 
number  of  nerve  fibers  discharge  once  threshold  was  reached  (Salvi  et  al.,  1983).  These 
data  support  the  theory  that  reduced  frequency  selectivity  is  the  major  factor  involved  in 
loudness  recruitment. 

However, Moore (1996) reviews more recent studies that suggest frequency
selectivity is a minor factor and that “loudness recruitment results from the loss of a
fast-acting compressive nonlinearity that operates in the normal peripheral auditory system” (p.
138).  This  statement  suggests  that  the  major  cause  of  abnormal  loudness  growth  is  not 
the  inability  of  the  damaged  auditory  system  to  hear  the  signal’s  parameters  but  its  inability 
to  accurately  process  the  wide  range  of  very  rapid  changes  within  an  acoustic  signal. 
Ruggero and Rich (1991) observed that the input-output functions of the basilar
membrane changed after chinchillas were injected with furosemide. In the normal ear, the
shape of the function curve indicated the basilar membrane had a compressive
nonlinearity for intensities between 30 and 90 dB SPL. After the injection, the function curve
changed, indicating more signal intensity was required to elicit a response, and that a much
steeper or more rapid increase of membrane movement was observed at the point of the signal’s
CF. Moore, Wojtczak, and Vickers (1996) reported psychophysical data supporting this
theory.  Loudness  matching  and  modulation  matching  functions  were  obtained  in  listeners 
with  unilateral  hearing  losses.  Modulation  rates  used  were  representative  of  the  range  of 
modulations  present  in  the  speech  envelope.  Subjects  matched  a smaller  modulation  depth 
in  the  impaired  ear  to  a larger  modulation  depth  in  the  normal  ear.  The  authors  concluded 
that  the  loss  of  a fast-acting  compressive  nonlinearity  in  the  peripheral  auditory  system 
resulted  in  loudness  recruitment,  thus  exaggerating  and  distorting  a dynamic  signal  such  as 
speech. 
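The effect of losing a compressive nonlinearity can be illustrated with a toy input-output model. This is a sketch under stated assumptions, not a reconstruction of any cited data: only the 30 to 90 dB SPL compressive range comes from the text above, while the compressive slope of 0.2 dB/dB, the 48 dB threshold elevation, and the function names are hypothetical values chosen so that the two curves meet at 90 dB SPL.

```python
def normal_output_db(input_db_spl):
    """Toy normal ear: linear below 30 dB SPL, compressive from 30 to 90 dB SPL.

    The 0.2 dB/dB compressive slope is an assumed, illustrative value.
    """
    if input_db_spl <= 30.0:
        return input_db_spl
    if input_db_spl <= 90.0:
        return 30.0 + 0.2 * (input_db_spl - 30.0)
    return 42.0 + (input_db_spl - 90.0)


def impaired_output_db(input_db_spl):
    """Toy impaired ear: no response below an elevated threshold, then linear.

    The 48 dB threshold elevation is an assumed value chosen so that the
    impaired curve catches up with the normal curve at 90 dB SPL.
    """
    return max(0.0, input_db_spl - 48.0)


# Growth in output for the same 40 dB increase in input (50 to 90 dB SPL).
normal_growth = normal_output_db(90.0) - normal_output_db(50.0)
impaired_growth = impaired_output_db(90.0) - impaired_output_db(50.0)

print(round(normal_growth, 1))
print(round(impaired_growth, 1))
```

In this toy model the same 40 dB rise in input produces a far larger rise in output in the impaired ear than in the normal ear, which is the pattern of abnormally rapid growth that the text describes as loudness recruitment.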

Psychoacoustic  data  specifically  measuring  frequency  selectivity  were  reported  by 
Florentine,  Buus,  Scharf  and  Zwicker  (1980)  and  Festen  and  Plomp  (1983).  Florentine  et 
al.  (1980)  compared  frequency  selectivity  ability  in  listeners  with  normal  hearing  and  with 
various  types  of  hearing  impairment  by  using  narrow  band  and  two-tone  masking,  a 
loudness  summation  function  and  by  determining  PTCs.  The  authors  found  that  frequency 
selectivity  was  reduced  within  the  frequency  range  of  the  cochlear  hearing  loss  and  that  it 
was  negatively  correlated  with  the  degree  of  cochlear  hearing  loss.  These  findings  were 
supported  further  when  reduced  frequency  selectivity  was  revealed  as  the  bandwidth  of 
the PTC increased and the slopes decreased. Overall, listeners with cochlear hearing
impairment performed significantly more poorly in frequency selectivity than listeners with
normal hearing did at equal intensity levels (SPL).

Similar  results  were  reported  by  Festen  and  Plomp  (1983)  who  studied  the 
relationships  between  auditory  functions  of  twenty-two  listeners  with  moderate 
sensorineural hearing impairment. Frequency resolution measures derived using
simultaneous and nonsimultaneous masking with comb-filtered noise indicated that
frequency resolution was “approximately independent of audiometric loss” (p. 652).
Perhaps more important is the finding that 65% of the total variance between subjects was
explained by hearing threshold and frequency resolution. Because the remaining 35% was
unexplained, this analysis suggested that several other parameters contribute to auditory function.

The  last  neurophysiologic  correlate  to  review  involves  the  parameter  of  time.  The 
time  patterns  or  phase-locking  of  neural  impulses  in  the  auditory  nerve  appear  to  code  a 
signal’s  time  properties.  The  effect  of  cochlear  damage  on  phase-locking  is  unclear 
(Moore,  1996).  Although  Harrison  and  Evans  (1979)  found  phase-locking  was  unaffected 
by  kanamycin  damaged  hair  cells  in  the  guinea  pig,  Woolf,  Ryan  and  Bone  (1981)  found 
reduced  phase-locking  in  similarly  damaged  cochleas  of  chinchillas. 

The  time  properties  of  temporal  resolution  and  integration  may  be  processes 
reflecting  the  performance  of  the  central  auditory  system  more  than  the  peripheral  system. 
Forward  masking  paradigms  have  been  used  to  study  the  adaptation  and  recovery  rates  in 
auditory nerve fibers. Henderson et al. (1994) review the study by Salvi, Saunders, Ahroon,
Shivapuja and Arehole (1986) with noise-damaged chinchillas, which showed a
prolonged rate of recovery from short-term adaptation. Supporting these data are forward
masking experiments in psychophysical investigations which have demonstrated that
listeners  with  impaired  hearing  have  a slower  recovery  rate  when  tested  at  the  same 
intensity  levels  as  listeners  with  normal  hearing  (Moore,  1996). 

In  his  review  of  studies  on  modulation  rate  detection  and  gap  detection,  Moore 
discusses the many factors influencing temporal resolution, i.e., frequency, intensity level,
and a “central decision process” (p. 140). As a general rule, these studies demonstrated
that  listeners  with  sensorineural  hearing  impairment  were  less  sensitive  and  performed 
markedly  worse  at  every  task  than  did  listeners  with  normal  hearing.  However,  it  was 
unclear if their poorer performance was solely due to abnormal temporal resolution or
to another complicating factor such as the inaudibility of certain frequencies or loudness
recruitment.

Finally, it is unclear how much the cochlea affects temporal integration. When
changes in absolute threshold as a function of signal duration are used as an indicator of temporal
integration, listeners with sensorineural hearing loss usually show smaller changes in
thresholds than listeners with normal hearing (Solecki & Gerken, 1990). Also, this
reduction  of  temporal  integration  appears  to  be  more  severe  for  long  sounds  than  short 
sounds  (Moore,  1996).  However,  Salvi  et  al.  (1983b)  did  not  find  abnormal  decay  in 
auditory  nerve  firing  with  increased  duration  of  a signal  in  damaged  cochleas.  Regardless 
of  the  origin  of  reduced  temporal  integration,  the  characteristic  intensity  and  duration  of 
individual  speech  sounds  would  probably  affect  their  audibility  to  a listener  with  a hearing 
loss,  thus  affecting  their  understanding  of  speech. 

To conclude this section, the data presented do not support the idea that
threshold sensitivity is the major factor in speech perception. Henderson et al. (1994)
state the following about the neurophysiologic changes that accompany sensorineural
hearing loss:

It is now apparent from both psychophysical and physiological studies that SNHL
involves more than a simple loss of sensitivity. Rather, a complex series of changes
occurs in the neural coding of sound which in turn gives rise to a myriad of auditory
perceptual distortions in the hearing-impaired listener. (p. 53)
Effects of Sensorineural Hearing Loss on Speech Perception

It  is  consistently  found  in  clinical  audiology  that  patients  with  the  same  degree  and 
configuration  of  sensorineural  hearing  loss  will  perceive  speech  differently  in  a substantial 
way  (Preminger  & Wiley,  1985).  These  patients  also  differ  in  the  benefit  received  from 
amplification even when fit with the same model of hearing aid. Older studies (Elliott,
1963; Menzel, 1958; Mullins & Bangs, 1957; Young & Gibbons, 1962) found that
speech recognition abilities correlated most highly with pure tone thresholds at
2000 Hz and 3000 Hz.

In  1980,  Yoshioka  and  Thornton  analyzed  various  audiometric  measures  of  529 
ears  to  determine  predictors  of  speech  recognition  performance.  They  also  found  the 
threshold  at  2000  Hz  to  be  the  best  single  predictor  (r=.71)  of  recognition  of  CID  W-22 
words.  Of  the  derived  values,  the  pure  tone  average  (PTA)  had  the  highest  correlation 
(r=.75)  with  speech  recognition  scores  while  slope  or  configuration  had  the  least 
relationship (r=.34). However, these relationships appeared to be consistently true for
mild hearing losses only. Yoshioka and Thornton found that as the degree of hearing loss
increased  into  the  moderate  and  severe  categories,  the  variability  of  speech  recognition 
scores  had  an  overriding  effect  on  any  predictive  value  or  system.  The  importance  of 
these  results  is  supported  by  the  large  number  of  subjects/ears  studied  and  the  well  defined 
criteria  used  for  acceptable  audiometric  measures.  An  important  limitation  in  generalizing 
these  results  is  that  the  only  speech  recognition  performance  score  used  was  for  W-22 
monosyllabic  words  presented  in  quiet  at  40  dB  re:  SRT. 
In their analysis of consonant confusions made by listeners with hearing loss,
Bilger  and  Wang  (1976)  found  that  consonant  error  patterns  and  overall  performance 
(percent  accurate  response)  were  independent  of  each  other,  sometimes  dramatically. 
Grouped  according  to  similar  audiograms,  some  subjects  had  similar  error  patterns  but 
very  different  overall  speech  recognition  scores.  The  converse  situation  was  found  also. 
These  findings  suggested  that  an  overall  performance  score  may  reflect  use  of  different 
phonetic  features  across  listeners  with  similar  audiometric  configurations.  These  results 
are  in  contrast  with  Danhauer  and  LaWarre  (1979)  who  found  listeners  with  normal 
hearing  and  those  with  impaired  hearing  used  highly  similar  features,  independent  of 
degree  of  hearing  loss.  A succinct  summary  of  the  predictive  value  of  the  audiogram  for 
speech  perception  abilities  was  provided  by  Humes,  Tharpe  and  Bratt  (1984)  when  they 
stated  that  the  audiogram  provides  only  a very  rough  estimate  of  the  underlying  cochlear 
and/or  eighth  nerve  pathology. 

It  is  not  being  proposed  that  the  pure  tone  audiogram  is  useless  in  assessing 
communicative  function.  It  is  the  clinical  measurement  of  a person’s  threshold  sensitivity. 
Audibility  of  a signal  above  a person’s  hearing  thresholds  is  absolutely  necessary  for 
accurate  speech  recognition  through  the  auditory  channel.  Otherwise,  information  about 
the speech spectrum is lost. Sher and Owens (1974) reviewed evidence from the previous
fifty  years  that  indicated  the  necessity  of  audible  acoustic  cues  above  2000  Hz  for 
recognizing  words  with  high  frequency  phonemes,  especially  in  the  presence  of  noise  and 
other  forms  of  distortion.  Owens,  Benedict  and  Schubert  (1972)  analyzed  consonant 
errors  made  by  listeners  with  hearing  impairment  who  had  various  audiometric 
configurations. They found recognition of words with the phonemes /s, t, θ/ to be highly
dependent on the presence of frequencies above 2000 Hz. Words with the phonemes
/ʃ, tʃ, dʒ/ were highly dependent on frequencies between 1000 and 2000 Hz being
present.  These  patterns  were  duplicated  in  listeners  with  normal  hearing  under  low  pass 
filtering  conditions.  These  results  indicated  that  the  recognition  errors  made  were  related 
to  the  available  frequency  information  and  not  to  other  variables  within  the  listeners  with 
hearing  impairment. 

Bilger  and  Wang  (1976)  did  a post-hoc  analysis  of  data  on  consonant  confusions 
by  listeners  with  hearing  impairment  according  to  audiometric  configurations.  Their 
findings agreed with Owens et al. (1972) in that listeners with high frequency hearing losses
had particular difficulty with sibilant sounds. However, unlike Owens et al., they found that
specific errors did not relate only to specific audiometric configurations. Attributing this
conflict to differences in methodology, Bilger and Wang concluded overall that “patterns of
consonant  perception  in  patients  with  sensorineural  hearing  loss  may  be  predicted  from 
audiometric  data”  (p.739).  Although  audiologists  may  use  these  data  as  the  foundation  of 
their  general  explanation  of  speech  sound  audibility  to  a patient,  specific  data  do  not 
strongly  support  audibility  to  be  the  dominating  factor  in  speech  recognition.  Also,  it  is 
unknown  to  what  extent  nonsense  syllable  data  can  be  generalized  to  word  and  sentence 
stimuli. 

A better  controlled  study  by  Suter  (1985)  compared  hearing  thresholds  to 
recognition  of  monosyllable  words  and  sentences  by  listeners  with  hearing  impairment. 
Although  her  purpose  was  to  disprove  the  test  validity  of  the  1959  AAOO  Hearing 
Handicap Rating procedure still being used in many states, her data reflect differences in
speech recognition performance between subject groups, speech stimuli, and S/N ratios.
Forty-eight subjects, all rated by the 1959 procedure as having no hearing handicap,
were divided into three groups based on hearing loss. Recognition
scores  on  the  CID  Everyday  Sentence  Test  and  the  Modified  Rhyme  Test  were  obtained  in 
sound  field  under  three  different  S/N  ratios  using  12-talker  babble.  There  were  significant 
interactions  between  subject  groups  and  S/N  ratios  and  between  speech  stimuli  and  S/N 
ratios.  These  findings  also  indicated  significant  differences  between  the  three  groups 
under  all  listening  conditions  for  both  speech  stimuli,  with  marked  increases  in  differences 
as  S/N  ratio  decreased.  Ad  hoc  regrouping  of  subjects  according  to  pure  tone  thresholds 
at  2000  Hz,  3000  Hz,  and  4000  Hz  revealed  even  greater  differences  in  performance 
scores  between  subjects. 

Moore  (1996)  discusses  the  Articulation  Index  (AI)  (Fletcher,  1952;  French  and 
Steinberg,  1947)  as  a method  to  quantify  the  importance  of  threshold  sensitivity  to  speech 
intelligibility.  The  AI  is  based  on  the  calculated  abilities  of  listeners  with  normal  hearing  to 
hear  the  long  term  average  spectra  of  speech  and  a background  sound.  Moore  cites  a few 
studies  that  have  shown  that  the  AI  can  accurately  predict  speech  intelligibility  in  listeners 
with  hearing  impairment  (e.g.,  Lee  & Humes,  1993).  However,  he  cites  numerous  studies 
that  have  found  speech  intelligibility  worse  than  what  was  predicted  by  the  AI,  particularly 
for listeners with moderate or severe hearing losses (e.g., Dubno, Dirks, & Schaefer,
1989; Pavlovic, Studebaker & Sherbecoe, 1986; Smoorenburg, 1992). Most studies did
find that AI predictions were accurate for listeners with mild losses. These results suggest
that pure tone threshold sensitivity is not the only factor involved in speech intelligibility
for  individuals  with  hearing  losses  greater  than  35  to  40  dB  HL. 
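
The audibility reasoning behind the AI can be illustrated with a simplified band calculation. The sketch below is only in the spirit of the AI: it assumes that each frequency band contributes in proportion to how much of a 30 dB speech range lies above the higher of the listener's threshold and the noise level. The function name, the four bands, the band weights, and all levels are hypothetical illustration values, not the published French and Steinberg (1947) values.

```python
def articulation_index(speech_peaks_db, thresholds_db, noise_db, weights,
                       dyn_range_db=30.0):
    """Simplified, AI-style audibility index between 0.0 and 1.0.

    All arguments are per-band lists of equal length, with levels expressed
    on the same dB scale. The weights are hypothetical band-importance
    values and must sum to 1.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("band weights must sum to 1")
    total = 0.0
    for peak, thr, noise, w in zip(speech_peaks_db, thresholds_db, noise_db, weights):
        # Speech is audible only above whichever is higher: threshold or noise.
        floor = max(thr, noise)
        # Fraction of the assumed 30 dB speech range that is audible, clipped to 0..1.
        audible = min(max((peak - floor) / dyn_range_db, 0.0), 1.0)
        total += w * audible
    return total


# Hypothetical four-band example (low to high frequency).
weights = [0.2, 0.3, 0.3, 0.2]
speech_peaks = [60, 60, 55, 50]
quiet = [0, 0, 0, 0]

ai_normal = articulation_index(speech_peaks, [10, 10, 10, 10], quiet, weights)
ai_sloping = articulation_index(speech_peaks, [20, 30, 50, 65], quiet, weights)
print(ai_normal, ai_sloping)
```

With these made-up values the index is 1.0 for the flat normal thresholds and about 0.55 for the sloping high frequency loss, because the two highest bands are mostly or entirely below threshold. An index of this kind captures audibility only, which is why it cannot account for performance that is worse than predicted.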

Another way to evaluate the role of threshold sensitivity in speech intelligibility is
to use a speech reception threshold (SRT)-in-noise function (Festen and Plomp, 1983;
Plomp, 1978, 1986; Smoorenburg, 1992). It is important to note that these authors, based
in the Netherlands, use sentences rather than spondee words to determine SRT. If
the SRT performance of listeners with hearing impairment is limited by the parts of
the speech spectrum occurring below their absolute threshold, then an
SRT-to-noise-ratio-threshold level should decrease progressively as the overall noise level is increased. This
SRT  decrease  would  occur  as  the  previously  inaudible  speech  spectrum  rose  above 
absolute  threshold.  At  high  noise  levels,  there  should  be  little  difference  between  SRTs  of 
listeners  with  hearing  impairment  and  with  normal  hearing.  Although  Lee  and  Humes 
(1993)  reported  results  consistent  with  these  predictions  for  elderly  listeners  with  mild 
hearing  impairment,  the  results  of  the  studies  by  Plomp  and  Smoorenburg  did  not.  For  all 
noise  levels  above  a certain  value,  the  SRT  of  listeners  with  hearing  impairment  remained 
relatively  constant  and  remained  greater  than  for  listeners  with  normal  hearing  at  high 
noise  levels.  These  results  would  indicate  that  factors  other  than  audibility  of  the  signal 
are  involved  in  order  for  the  listener  with  hearing  impairment  to  understand  speech. 

Finally,  two  studies  using  nonsense  syllables  (i.e.  six  stop  consonants)  compared 
the  importance  of  signal  audibility  in  the  performance  of  listeners  with  normal  hearing  and 
with  hearing  impairment  (Dubno,  Dirks  & Ellison,  1989;  Turner  & Robb,  1987).  Turner 
and  Robb  used  a filtering  model  to  estimate  the  audible  portion  of  each  stop-consonant 
spectrum at each presentation level. Recognition scores for listeners with hearing
impairment were consistently lower than for listeners with normal hearing at all
presentation levels, even when estimated audibility was nearly 100%. Dubno et al. used
low  and  high  pass  filtering  to  determine  what  frequency  bandwidths  (BWs)  were  necessary 
for  perception  of  consonant  place  cues  by  listeners  with  normal  hearing  and  with  high 
frequency  hearing  loss.  Under  low  pass  filtering  conditions,  no  differences  were  found 
between  listener  groups  in  stop-consonant  recognition  or  error  patterns.  In  other  words, 
no  differences  were  found  for  BWs  corresponding  to  normal  hearing  regions  of  both 
groups.  However,  under  high  pass  filtering  conditions,  significant  differences  in 
recognition,  error  patterns,  and  vowel  context  were  observed  between  groups.  The 
authors  concluded  that  “the  reduction  in  audibility  for  normal-hearing  listeners  provided 
by  fixed-frequency,  low  pass  filters  did  not  appropriately  model  changes  in  recognition 
resulting  from  high-frequency  hearing  loss”  (p.354). 

The  overall  results  of  these  studies  reviewed  indicate  that  audibility  of  the  signal 
which  is  decreased  through  loss  of  threshold  sensitivity  in  the  listener  is  a critical  factor  in 
speech  perception.  However,  this  factor  may  not  be  the  dominant  contributor  to 
difficulties  in  understanding  speech  by  all  listeners  with  hearing  impairment.  For  listeners 
with  mild  losses  (<40-45  dB  HL),  threshold  sensitivity  does  appear  to  be  the  major  factor 
in  speech  perception.  This  finding  is  expressed  by  patients  who  say,  “I  hear  just  fine  when 
people  just  speak  up  a little.”  However,  for  listeners  with  moderate  to  severe 
sensorineural  hearing  impairments,  speech  perception  difficulties  are  more  influenced  by 
the audible or suprathreshold stimuli and their acoustic parameters (Moore, 1996). This
finding is demonstrated by the patient who reports, “I can hear people talking. They’re
plenty  loud.  The  words  just  don’t  sound  clear  and  distinct.” 

Suprathreshold  stimuli  distortions  associated  with  speech  recognition  difficulties 
were  delineated  by  Levitt  in  the  1982  Vanderbilt  Hearing  Aid  Report.  The  list  included 
the  following:  a)  distortions  of  loudness  relationships,  b)  reduced  linear  range  of  the 
auditory  system,  c)  reduction  in  sensitivity  to  intensity  changes,  d)  reduction  in  frequency 
sensitivity,  e)  broadened  critical  bands  (or  auditory  filters),  f)  poorer  ability  to  extract 
signals  from  noise,  g)  greater  spread  of  masking  effects,  h)  poorer  temporal  processing 
and  i)  nonlinear  distortion  components. 

When  discussing  these  difficulties,  basic  acoustics  of  speech  need  to  be  considered. 
The  sound  energy  of  stressed  speech  syllables  has  a range  of  about  30  dB  between  the 
strongest  vowels  and  the  weakest  consonants.  This  dynamic  range  increases  to  40  dB  for 
running  speech  which  includes  unstressed  syllables  (Boothroyd,  1993).  Part  of  this  range 
can  be  accounted  for  by  the  6 dB/octave  decrease  in  intensity  as  frequency  increases. 
Performance/intensity functions of various speech stimuli by listeners with normal hearing
demonstrate this dynamic range of speech and the easy accommodation of the wide range
of intensities. Listeners with hearing impairment have a reduced dynamic range due to
elevated thresholds. In addition, once the signal does become audible, the perception of
its  loudness  growth  is  not  linear  until  it  reaches  a high  suprathreshold  intensity.  According 
to  Boothroyd,  these  characteristics  of  reduced  dynamic  range,  compressed  or  “recruited” 
loudness  function  and  reduced  linear  range  distort  the  normal  phonemic  relationships  and 
contrasts  in  speech.  Thus,  recognition  probability  is  reduced.  Studies  involving  amplitude 
modulation and pulsed tones (Glasberg & Moore, 1989; Moore, Wojtczak & Vickers,
1996; Schroder, Viemeister & Nelson, 1994) have shown that listeners with hearing
impairment perform as well as or better than listeners with normal hearing on intensity
discrimination and resolution tasks when listening at the same sensation level (SL). When
compared at equal sound pressure levels (SPLs), the performances of listeners with
hearing impairment were equal to or worse than those of listeners with normal hearing. People with
hearing  impairment  often  listen  to  everyday  conversation  at  low  sensation  levels  (SLs). 
Therefore,  whatever  information  that  intensity  cues  provide  to  the  recognition  of  a rapidly 
changing  speech  spectrum  will  be  distorted  by  cochlear  impairment  (Boothroyd,  1993). 
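
The distinction between equal-SL and equal-level comparisons comes down to simple arithmetic: sensation level is the presentation level minus the listener's threshold, with both expressed on the same dB scale. The short sketch below works through one case; the function names and the threshold and presentation values are hypothetical and chosen only for illustration.

```python
def sensation_level(presentation_db, threshold_db):
    """Sensation level (dB SL): how far a signal lies above the listener's threshold."""
    return presentation_db - threshold_db


def level_for_equal_sl(target_sl_db, threshold_db):
    """Presentation level needed to reach a given sensation level."""
    return threshold_db + target_sl_db


# Hypothetical listeners hearing the same 50 dB signal.
normal_threshold = 5
impaired_threshold = 40

sl_normal = sensation_level(50, normal_threshold)      # 45 dB SL
sl_impaired = sensation_level(50, impaired_threshold)  # 10 dB SL

# Level the impaired listener would need in order to match the normal listener's SL.
matched_level = level_for_equal_sl(sl_normal, impaired_threshold)  # 85 dB
print(sl_normal, sl_impaired, matched_level)
```

At the same presentation level, the listener with the elevated threshold is operating only 10 dB above threshold in this example, which is the low-SL situation described above for everyday conversation.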

The  reduction  of  frequency  sensitivity  and  selectivity  affects  the  ability  to  hear 
phonemic  contrasts,  particularly  consonant  place  cues,  and  to  separate  frequency 
components  of  complex  sounds.  Frequency  selectivity  or  resolution  is  most  critical  in 
speech  recognition  in  noise.  In  their  study  of  auditory  functions,  Festen  and  Plomp  (1983) 
found  a high  correlation  (r=.83)  between  mean  audiometric  hearing  loss  and  speech 
recognition  of  sentences  in  quiet.  However,  for  speech  recognition  in  noise  the  highest 
correlations (r=.49 to .63; p<.01) were with frequency resolution measures. A low,
nonsignificant correlation (r=.31) was found between speech recognition in noise and
audiometric  mean.  The  authors  concluded  that  speech  recognition  in  quiet  for  listeners 
with  moderate  hearing  losses  was  governed  by  audiometric  loss,  whereas  speech 
recognition  in  noise  is  more  closely  related  to  frequency  resolution. 

Based  on  their  psychoacoustic  data  previously  reviewed,  Florentine  et  al.  (1980) 
suggested  that  gradual  decrease  in  frequency  selectivity  may  account  for  the  marked 
decrease in speech recognition seen in listeners with hearing impairment greater than
30-40 dB HL. The broadened auditory filters might include the first and second formants of
vowels, thus making it difficult, if not impossible, to discriminate between two phonemes.
In addition, the higher intensity of F1 would have a forward masking effect on F2 and high
frequency  cues  of  consonants.  This  effect,  coupled  with  the  masking  effect  of 
environmental  noise  and  the  reduced  ability  to  separate  out  meaningful  speech  cues,  may 
be  a major  reason  why  listeners  with  hearing  impairment  have  so  much  difficulty 
recognizing  speech  in  noise. 

The  third  major  acoustic  parameter  of  time  also  affects  speech  recognition.  The 
dynamic  fluctuations  of  intensity  and  frequency  within  the  speech  spectrum  occur  over 
time.  Phonemes  have  inherent  temporal  characteristics  that  distinguish  them  from  each 
other,  such  as  vowel  versus  consonant  and  stop-plosive  versus  sibilant  (Formby,  Barker  & 
Raney, 1993). The majority of the literature related to the temporal properties of speech
consists of either psychoacoustic studies or investigations of central auditory processing. Review
of  the  large  body  of  literature  concerning  central  auditory  abilities  is  beyond  the  scope  of 
this  work. 

However,  one  investigation  that  included  time  variables  tested  listeners  who  had 
characteristics  typical  of  cochlear  loss  (Lacroix  & Harris,  1979).  The  speech  recognition 
of  forty-five  listeners  with  hearing  impairment  was  evaluated  for  sentences  presented 
under  several  distortion  conditions,  which  included  compression  and  interruption  of  the 
signal.  Results  indicated  performance  decreased  for  both  of  these  variables  as  the 
frequency range of hearing impairment decreased to 2000 Hz. Those listeners with
hearing loss at 4000 Hz and above performed equally with listeners with normal
hearing.

Finitzo-Hieber  and  Tillman  (1978)  and  Nabelek  and  Mason  (1981)  demonstrated 
the  deleterious  effect  of  increased  reverberation  time  on  the  speech  recognition  of  listeners 
with  hearing  impairment  in  quiet  conditions  in  comparison  to  listeners  with  normal 
hearing.  Although  reverberation  is  referred  to  in  units  of  time,  by  definition  reverberation 
is the measurement of the reduced intensity over time of a particular signal (Skinner,
1988). When the signal is an ongoing dynamic signal such as speech, each succeeding
sound  wave  mixes  and  is  changed  by  the  still  present  preceding  wave  which  now  provides 
an  interference  or  competing  signal.  The  result  is  a "smearing"  of  frequency 
characteristics  and  a distortion  of  intensity.  The  relationships  of  the  signal's  basic 
parameters  have  become  distorted.  If  there  is  additional  unrelated  noise  in  the 
background,  listeners  with  hearing  impairment  have  even  poorer  speech  recognition,  even 
though  the  signal's  spectrum  may  be  totally  audible  (Nabelek  & Mason,  1981;  Turner  & 
Robb,  1987). 

The above information indicates that speech perception by listeners with hearing
impairment is affected not only by multiple variables but also by their combined effects and
interactions. Plomp (1978) proposed a model of speech recognition that demonstrates
that  the  difficulties  that  listeners  with  hearing  impairment  experience  are  a result  of 
attenuation  and  other  distortions  that  are  caused  by  the  hearing  impairment.  The  listeners' 
elevated  thresholds  cause  a signal's  spectral  energy  to  be  partially  or  totally  inaudible. 
Distortions  of  the  audible  portions  of  the  signal  are  created  by  the  impaired  auditory 
mechanisms. These distortions are reduced frequency sensitivity and selectivity,
intensity/loudness  functions,  and  temporal  resolution  and  integration.  Other  investigations 
support  Plomp  in  stating  that  while  audibility  plays  the  dominating  role  in  understanding 
speech  in  quiet,  it  is  other  distortions  that  dominate  the  difficulties  that  listeners  with 
hearing  impairment  have  in  understanding  speech  in  noise  (Baer  & Moore,  1993;  Festen 
and Plomp, 1983; Gagne & Erber, 1987; Moore & Glasberg, 1993; Plomp, 1978, 1986;
Villchur,  1974,  1977). 

Briefly,  Plomp’s  model  is  based  on  the  measurement  of  the  speech  recognition 
thresholds  (SRTs)  of  listeners  with  hearing  impairment  in  quiet  and  in  several  levels  of 
noise  which  allow  for  full  audibility  of  the  speech  signal.  These  SRT  measurements  are 
compared  to  the  SRTs  of  listeners  with  normal  hearing  obtained  under  similar  conditions. 
Listeners  with  hearing  impairment  generally  have  higher  SRTs  in  both  quiet  and  noise 
(Humes  et  al.,  1988).  The  difference  between  SRTs  of  the  two  groups  in  quiet  is 
interpreted as the result of audibility (A) and distortions (D), designated as SHL(A+D). The
difference between SRTs in noise is interpreted as the result of distortions, i.e., SHL(D),
since  the  audibility  factor  has  been  accounted  for  by  the  procedure.  Plomp  (1976) 
postulates  that  quantifying  the  components  of  the  "D  factor"  would  provide  the  missing 
link  in  helping  listeners  with  hearing  impairment  understand  speech. 
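
The separation of the A and D components can be illustrated numerically. The sketch below assumes the power-sum form commonly attributed to Plomp (1978), in which the SRT is the decibel sum of a quiet-limited term and a noise-limited term. The function names and all parameter values (the normal SRT in quiet, the speech-to-noise ratio at threshold, and the A and D values) are hypothetical and are not data from any study reviewed here.

```python
import math


def srt(noise_level_db, l0_db=20.0, snr_at_srt_db=-6.0, a_db=0.0, d_db=0.0):
    """SRT (dB) as the power sum of a quiet-limited and a noise-limited term.

    Assumed form, with hypothetical default values:
      quiet term = normal SRT in quiet (l0_db) + A + D
      noise term = noise level + speech-to-noise ratio at threshold + D
    A listener with normal hearing has a_db = d_db = 0.
    """
    quiet_term = l0_db + a_db + d_db
    noise_term = noise_level_db + snr_at_srt_db + d_db
    return 10.0 * math.log10(10 ** (quiet_term / 10) + 10 ** (noise_term / 10))


def speech_hearing_loss(noise_level_db, a_db, d_db):
    """SRT difference (dB) between an impaired and a normal listener."""
    return srt(noise_level_db, a_db=a_db, d_db=d_db) - srt(noise_level_db)


# Hypothetical listener with A = 30 dB and D = 5 dB.
shl_quiet = speech_hearing_loss(-100.0, a_db=30.0, d_db=5.0)  # effectively no noise
shl_noise = speech_hearing_loss(90.0, a_db=30.0, d_db=5.0)    # high noise level
print(round(shl_quiet, 2), round(shl_noise, 2))
```

Under these assumptions the SRT difference in quiet approaches A + D (35 dB here), while at a high noise level it approaches D alone (about 5 dB), which is the logic behind attributing the difference in noise to distortion.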

Humes, Espinoza-Varas, and Watson (1988) used several model equations,
including  Plomp's,  to  evaluate  data  from  previous  psychoacoustic  studies  of  frequency 
resolution,  temporal  resolution  and  speech  recognition.  These  data  were  obtained  from 
listeners  with  normal  and  impaired  hearing  who  performed  under  noise  masking 
conditions. The authors concluded that "at high input levels, the convergence of the
(model  input-output)  functions  implies  it  makes  little  difference  whether  the  noise 
underlying  threshold  elevation  is  externally  or  internally  introduced"  (p.201).  However, 
they  also  stated  that  the  similar  performance  of  listeners  with  true  or  simulated  hearing 
impairment should not be interpreted to mean that either listener group has normal processing
abilities.  Both  groups  demonstrated  processing  deficits  when  compared  to  the  listener 
with  normal  hearing.  Humes  et  al.  summarized  their  findings  by  stating  that  using  an 
elevation  of  thresholds  to  indicate  internal  noise  of  an  impaired  auditory  system  needs  to 
be examined further as a method to model sensorineural hearing loss.

Speech  Perception  Studies  Using  Hearing  Loss  Simulation 

One  long  established  method  of  investigating  the  speech  perception  difficulties  of 
listeners  with  hearing  impairment  is  to  simulate  hearing  loss  in  listeners  with  normal 
hearing.  Their  understanding  of  speech  under  various  acoustic  conditions  is  then 
compared  to  the  speech  perception  of  listeners  with  hearing  impairment.  If  their 
perceptual  responses  "match"  or  are  similar,  it  is  theorized  that  the  experimental  changes 
made  to  the  acoustic  signal  are  similar  to  the  changes  imposed  on  the  signal  by  an 
impaired  auditory  system.  Although  some  of  the  studies  previously  discussed  have  used 
this method, this section will highlight further investigations attempting to simulate
hearing  impairment  and  the  resulting  effects  on  speech  recognition. 

Various  approaches  have  been  used  to  simulate  hearing  loss  from  the  use  of  simple 
earplugs  or  a masker  hearing  aid  (Chafin  & Peipher,  1979;  Cox,  1977)  to  the  use  of  more 
complicated  laboratory  masking  and  filtering  equipment.  An  often  reported  approach  is  to 
raise the pure tone thresholds of listeners with normal hearing by using spectrally shaped
masking  noise  until  they  approximate  the  thresholds  of  listeners  with  hearing  impairment. 
The general conclusions of investigators have been that one of two major hypotheses
explains the majority of speech recognition difficulties experienced by listeners with hearing
impairment.  The  first  hypothesis  is  that  the  distortion  through  hearing  threshold 
sensitivity causes the speech signal to become inaudible or attenuated (French &
Steinberg, 1947; Fabry & Van Tassel, 1986; Humes, 1991; Humes, Dirks, Bell & Kincaid,
1987; Humes, Espinoza-Varas & Watson, 1988; Humes & Roberts, 1990; Kamm, Dirks,
& Bell, 1985; Wang, Reed, & Bilger, 1978; and Zurek & Delhorne, 1987). The second
hypothesis  is  that  psychoacoustic  distortions  that  may  originate  in  the  peripheral  and/or  in 
the  central  auditory  system  accompany  pure  tone  loss  and  are  the  dominant  factor  in 
speech  recognition  difficulties  of  listeners  with  hearing  impairment  (Baer  & Moore,  1993, 
1994; Gagne & Erber, 1987; Lacroix & Harris, 1979; Moore & Glasberg, 1993; Moore,
Glasberg,  & Vickers,  1995;  Needleman  & Crandell,  1996;  Plomp,  1978,  1986;  Walden, 
Schwartz,  Montgomery,  & Prosek,  1981;  and  Villchur,  1977). 

Another  way  to  organize  hearing  loss  simulation  studies  is  according  to  two 
methodological  characteristics:  1)  type  of  speech  stimuli  used  and  2)  presentation  of 
speech stimuli in quiet or in noise. The following discussion will review representative
studies  supporting  each  hypothesis,  within  the  structure  of  these  method  characteristics. 

Extensive  analysis  of  nonsense  syllable  recognition  by  listeners  with  normal  hearing 
under  simulated  hearing  loss  conditions  (“simulation  listeners”)  was  first  reported  by 
Wang,  Reed  and  Bilger  in  1978.  They  presented  vowel-consonant  and  consonant-vowel 
syllables in quiet under twelve filtering conditions. The authors found that phoneme errors
were similar to those of listeners with hearing impairment at various frequencies.
Therefore, it was concluded that threshold attenuation was the major factor contributing
to  speech  recognition  difficulties.  However,  they  did  note  that  simulation  listeners 
consistently  made  fewer  errors  than  listeners  with  hearing  impairment  that  approximated 
the  filter  conditions.  These  latter  data  would  suggest  that  some  other  factor(s)  besides 
threshold  attenuation  was  affecting  performance. 

Humes,  Dirks,  Bell,  and  Kincaid  (1987)  studied  the  effects  of  spectrally  shaped 
masking  noise  in  twelve  listeners  with  normal  hearing  and  compared  their  recognition  of 
nonsense syllables to the responses of four listeners with hearing impairment. The
simulation listeners were divided into four groups of three, and each group was matched
with one listener with hearing impairment; the masked thresholds of the simulation
listeners were matched within ±3 dB of the pure tone
thresholds of their representative listener with hearing impairment. Nonsense syllable
stimuli  were  presented  in  quiet  at  intensity  levels  of  56,  66,  and  76  dB  HL.  The  results 
indicated  no  difference  in  percent-correct  scores  between  listeners  with  hearing 
impairment  and  simulation  listeners.  In  fact,  one  of  the  listeners  with  hearing  impairment 
performed  better  than  any  of  the  simulation  listeners.  The  investigators  concluded  that  the 
primary  factor  in  speech  recognition  of  listeners  with  hearing  impairment  was  the  loss  of 
signal  audibility  or  of  pure  tone  sensitivity. 

However,  it  is  important  to  point  out  two  major  problems  with  the  conclusions 
made  by  Humes  et  al.  (1987).  First,  the  standard  for  each  noise  masked  group  was 
provided by only one subject with hearing impairment. Secondly, and more importantly,
the authors did not use a defined phonetic analysis procedure to determine differences in
consonant  error  rates  between  listener  groups.  Also,  these  differences  were  found  across 
individual  nonsense  subsets  and  varied  with  presentation  level.  These  findings  would 
suggest  that  performance  is  affected  by  some  variable  or  variables  other  than  threshold 
attenuation.  Otherwise,  listener  responses  to  the  same  stimuli  would  have  minimal 
variation. 

In  1981,  Walden  et  al.  investigated  the  consonant  recognition  of  eight  listeners 
with  unilateral  hearing  impairment.  Hearing  loss  was  simulated  in  the  ear  with  normal 
hearing  to  match  the  audiometric  configuration  of  the  ear  with  hearing  impairment.  In 
addition,  six  listeners  with  normal  hearing,  bilaterally,  served  as  controls  who  heard  stimuli 
under  filtered  conditions.  Results  indicated  large  differences  (mean  of  20.8%)  and  a weak 
relationship  (r=.41)  in  consonant  recognition  between  the  ears  of  each  listener  with 
unilateral  hearing  loss.  However,  patterns  of  feature  recognition  were  similar.  According 
to  the  authors,  these  data  indicated  that  other  types  of  distortion  contributed  by  the 
hearing  impairment  were  imposed  on  the  spectral  shaping.  Also,  the  simulation  of  hearing 
loss  according  to  audiometric  thresholds  alone  did  not  adequately  simulate  individual 
consonant  recognition  or  overall  recognition  ability.  It  did  simulate  the  effects  of  hearing 
impairment  on  patterns  of  feature  recognition,  however. 

In  contrast,  Fabry  and  Van  Tassel  (1986)  used  a similar  paradigm  by  testing  six 
adults  with  unilateral  hearing  impairment.  Masking  and  filtering  procedures  were  used  in 
the ear with normal hearing to match within ±3 dB of the pure tone thresholds in the ear
with  hearing  impairment.  Consonant-vowel  syllables  were  presented  at  65  dB  SPL  in 
quiet to the ear with hearing impairment and to the simulated ear (using frequency filtering
or  noise  masking).  The  overall  results  indicated  consonant  recognition  performance  was 
similar  between  simulated  ears  and  ears  with  hearing  impairment  under  filtering  and 
masked  conditions.  The  authors  concluded  that  1)  the  effectiveness  of  using  filtering  or 
masking  to  simulate  hearing  loss  was  similar,  2)  the  type  of  recruitment  achieved  through 
masking  had  no  effect  on  suprathreshold  speech  perception  in  the  normal  ear  and, 
therefore,  3)  the  major  effect  of  sensorineural  hearing  impairment  on  speech  perception  is 
loss  of  signal  audibility  or  pure  tone  threshold  sensitivity. 

These  conclusions  are  brought  into  question  when  specific  results  are  considered. 
When percentage-correct scores are compared between simulated ears and ears with hearing
impairment,  only  two  of  six  listeners  had  similar  performance  under  filtering,  three  of  the 
six  listeners  had  comparable  performance  under  masking  and  one  listener  performed 
equally  under  both  conditions.  Specific  results  of  consonant  error  pattern  analysis 
revealed  that  four  of  six  listeners  demonstrated  similar  error  patterns  under  filtering,  three 
of  six  showed  similarity  under  masking  and  three  of  six  displayed  equal  similarity  under 
both  conditions.  Otherwise,  neither  simulation  condition  produced  similar  error  patterns 
between  simulated  ears  and  ears  with  hearing  impairment.  These  results  appear  to  support 
the  conclusions  that  filtering  and  masking  have  equal  and  similar  effectiveness  in 
simulating  hearing  loss.  However,  they  do  not  clearly  support  the  conclusion  that 
distortion  through  loss  of  audibility  or  threshold  attenuation  plays  the  primary  role  in 
speech  perception  difficulties  of  the  hearing  impaired.  This  conclusion  does  not  address 
the variability and assumes that the filtering and masking stimuli affected no acoustic
or perceptual parameter other than signal intensity.

Zurek and Delhorne (1987) reported one of the few studies which presented
speech  stimuli  in  noise  to  listeners  with  normal  hearing  already  experiencing  a degraded 
signal  through  hearing  loss  simulation.  Fifteen  listeners  with  hearing  impairment  were 
grouped  according  to  five  types  of  hearing  loss  in  terms  of  severity  and  configuration: 
mild  shallow-rising,  mild  steep-rising,  mild  falling,  moderate  rising,  and  moderate  flat. 
Fifteen  listeners  with  normal  hearing  were  grouped  similarly  with  hearing  loss  simulation 
to  match  one  of  the  hearing  loss  types.  The  speech  stimuli,  consisting  of  three  sets  of 
consonant-vowel syllables that were low-pass filtered at 4500 Hz, were presented in the
background  of  speech  spectrum  noise  at  -20  dB  to  +20  dB  S/N  ratios.  No  differences  in 
consonant  recognition  performance  were  found  between  listeners  with  hearing  impairment 
and  simulation  listeners.  No  detailed  analysis  of  consonant  error  patterns  was  done, 
although  the  authors  reported  that  error  patterns  were  generally  similar  for  both  subject 
groups. Therefore, they concluded that the greatest effect on speech perception was that of
threshold  attenuation.  In  addition,  they  stated  that  amplification  would  restore  audibility 
and  provide  a satisfactory  speech  signal.  It  is  this  author's  opinion  that  this  conclusion  is 
not  strongly  supported  by  typical  events  that  occur  on  any  average  day  in  a hearing  aid 
dispensing  practice. 

Although  the  majority  of  the  above  studies  claim  to  support  the  hypothesis  that  a 
person’s  threshold  sensitivity  and  audibility  of  the  signal  are  the  dominant  factors  in 
speech recognition, the variability of individual subject data brings this conclusion into
question. There is also an assumption that speech recognition data using nonsense
syllables are predictive of connected speech recognition. Critical acoustic information is
transmitted when the sounds are combined into words and phrases that involve intensity,
frequency and time (e.g., transitional and coarticulation spectral cues, rhythm and
inflection) and cognitive/linguistic parameters (Boothroyd, 1993; Ling, 1989).

One  of  the  earliest  and  most  comprehensive  attempts  to  simulate  the  variety  of 
distortion  effects  related  to  sensorineural  hearing  loss  was  done  by  Villchur  (1974,  1977). 
Using  a 16-channel  system  with  computer  regulated  electronic  devices,  he  simulated  a 
severe,  high  frequency  sloping  hearing  loss  that  demonstrated  any  one  of  the  following 
aspects  of  hearing  impairment:  1)  loss  of  absolute  threshold  sensitivity,  2)  audiometric 
configuration,  3)  loudness  recruitment,  4)  reduced  dynamic  range  and  5)  reduced 
frequency  selectivity  and  discrimination.  Villchur  presented  PB-words  and  sentences  in 
quiet  at  94  dB  SPL,  in  white  noise,  and  in  speech  shaped  noise.  First,  the  performance 
scores  of  simulation  listeners  were  consistent  with  listeners  with  hearing  impairment  who 
had  similar  audiograms.  Second,  the  average  recognition  scores  for  both  speech  stimuli 
decreased  from  about  70%  in  quiet  to  50%  in  white  noise  to  less  than  38%  in  speech 
noise.  Although  audibility  was  not  fully  controlled,  Villchur  stated  that  severe  recruitment 
makes  it  difficult,  if  not  impossible,  to  make  the  weaker  parts  of  speech  audible  without 
the  stronger  parts  becoming  uncomfortably  loud  for  the  listeners  with  hearing  impairment. 
Third,  there  was  decreased  performance  when  frequency  selectivity  was  reduced. 
Unfortunately,  Villchur  did  not  report  quantitative  effects  of  combining  the  distortion 
variables (Gagne & Erber, 1987; Moore, 1996).

Moore and Glasberg (1993) simulated loudness recruitment and hearing loss by
adapting  Villchur’s  idea  that  loudness  recruitment  that  accompanies  the  loss  of  frequency 
sensitivity  disrupts  the  meaningful  relationships  between  sounds.  By  splitting  an  input 
signal  into  thirteen  frequency  bands  and  processing  each  band,  they  created  loudness 
sensations  in  the  listener  with  normal  hearing  that  resembled  loudness  recruitment  in  the 
listener with hearing impairment. Three types of hearing loss, i.e., flat moderate, flat
severe  and  sloping  high  frequency,  were  simulated  and  speech  recognition  was  measured 
using  a speech-to-noise-ratio-threshold  (Plomp,  1978).  The  major  difference  of  this 
investigation  was  measuring  speech  recognition  with  and  without  linear  amplification  set 
according  to  Byrne  and  Dillon's  (1986)  revised  NAL  procedure.  This  procedure  was  an 
attempt  to  remove  the  attenuation  component  of  the  hearing  loss.  In  addition,  simulation 
listeners  heard  the  sentence  stimuli  at  a fixed  65  dB  SPL  against  a noise  background  of  a 
single  talker.  Findings  revealed  reduced  speech  recognition  in  quiet  and  high  speech 
recognition  in  quiet  with  amplification.  But  speech  recognition  was  greatly  reduced  using 
amplification  in  single-talker  noise,  especially  for  the  high  frequency  sloping  loss. 

Moore,  Glasberg  and  Vickers  (1995)  reported  a later  experiment  which  used  the 
same  methods  except  there  was  no  speech  shaped  noise  in  the  background.  This  time  the 
simulation  listeners  using  amplification  understood  speech  almost  as  well  as  listeners  with 
normal  hearing.  Moore  et  al.  explained  that  these  differences  between  types  of  noise  were 
related  to  the  inability  of  a listener  with  reduced  dynamic  range  to  utilize  the  spectral  and 
temporal  spaces  (referred  to  as  "dips")  occurring  in  a background  of  a single  competing 
talker. The speech occurring in noise does not have these significant spaces, so speech
recognition is more dependent on the higher intensity parts of speech which are less
affected by recruitment.

Another  type  of  distortion  simulation  is  reducing  frequency  selectivity  by 
broadening  the  spectra  of  the  speech  stimuli  so  that  the  excitation  pattern  produced  in 
listeners  with  normal  hearing  resembles  the  pattern  seen  in  listeners  with  hearing 
impairment  (Moore,  1996).  It  is  theorized  that  this  type  of  signal  processing  simulates 
frequency  selectivity  in  terms  of  place  coding  in  the  auditory  system,  but  not  time  coding. 
Baer  and  Moore  (1993)  evaluated  recognition  of  key  words  in  sentences  in  quiet  and  in 
speech  noise  using  a speech  smearing  process.  They  found  no  significant  effect  in  quiet, 
even  when  simulated  excitation  patterns  were  six  times  broader  than  normal.  However, 
when  presented  in  noise  (0  dB  and  -3  dB  S/N  levels),  the  smearing  reduced  mean 
recognition  scores  to  an  accuracy  as  low  as  36%.  These  data  were  interpreted  as 
indicating  that  reduced  frequency  selectivity  by  spectral  smearing  significantly  contributes 
to  the  speech  recognition  difficulties  experienced  by  individuals  with  cochlear  damage 
when  listening  in  background  noise. 

A simulation  study  which  used  sentence  materials  in  noise  was  reported  by 
Needleman  and  Crandell  (1995).  This  investigation  had  two  unique  characteristics:  1) 
duration  of  sensorineural  hearing  loss  was  a control  variable  and  2)  an  adaptive  procedure 
to determine an SRT was used for both nonsense syllables and sentences. The competing
noise level was kept at 75 dB SPL and the speech signal was varied in 1 dB steps until a
50% correct score was obtained. By controlling the performance criteria, the SRTs could
be compared between listener groups. Ten listeners who had symmetrical mild-to-moderate
sensorineural hearing impairments for at least twenty years served as the
comparison  group  for  simulation  listeners.  The  thresholds  of  ten  pairs  of  listeners  with 
normal  hearing  were  matched  to  each  listener  with  hearing  impairment  using  a spectrally- 
shaped  masking  noise.  The  high  predictability  sentences  of  the  revised  Speech  Perception 
in Noise (SPIN) test (Bilger, 1984; Kalikow, Stevens, & Elliott, 1977) were presented in
12-talker  babble.  The  results  indicated  no  significant  difference  (p>.05)  in  recognition  of 
nonsense  syllables  suggesting  that  audibility  is  the  dominating  factor  in  recognition. 
However, the mean SRTs for sentences were significantly different (p<.01) between
simulation  listeners  and  listeners  with  hearing  impairment.  Needleman  and  Crandell 
concluded  that  these  data  suggest  that  distortion  factors  other  than  pure  tone  attenuation 
were  degrading  the  connected  speech  signal  for  listeners  with  hearing  impairment. 
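
The adaptive procedure described above can be sketched as a simple one-up, one-down staircase, which converges on the 50% correct point. This is only an illustration under stated assumptions: the function name, the starting level, the stopping rule (a fixed number of reversals), and the idealized listener are hypothetical and are not taken from Needleman and Crandell (1995). Only the 1 dB step size, the fixed 75 dB SPL noise level, and the 50% target come from the description above.

```python
NOISE_LEVEL_DB_SPL = 75.0  # fixed competing-noise level reported in the text


def adaptive_srt(respond, start_level_db, step_db=1.0, n_reversals=8, max_trials=500):
    """Estimate the speech level (dB SPL) giving about 50% correct.

    respond(level_db) -> bool stands in for the listener (hypothetical helper).
    The speech level drops one step after a correct response and rises one
    step after an error; the estimate is the mean level at the reversals.
    """
    level = start_level_db
    last_direction = 0
    reversals = []
    for _ in range(max_trials):
        direction = -1 if respond(level) else 1
        if last_direction and direction != last_direction:
            reversals.append(level)
        if len(reversals) >= n_reversals:
            break
        last_direction = direction
        level += direction * step_db
    if not reversals:
        raise ValueError("no reversals observed; check the starting level")
    return sum(reversals) / len(reversals)


# Idealized listener (assumption): always correct at or above 70.3 dB SPL.
ideal_listener = lambda level_db: level_db >= 70.3
srt_db_spl = adaptive_srt(ideal_listener, start_level_db=80.0)
srt_snr_db = srt_db_spl - NOISE_LEVEL_DB_SPL  # SRT expressed as a speech-to-noise ratio
```

Because every listener is driven to the same performance criterion, the resulting levels (or signal-to-noise ratios) can be compared directly across groups, which is the point made in the text.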

The studies just reviewed support the hypothesis that distortion factors play a
major role in suprathreshold speech recognition. However, these
investigations  have  measured  the  recognition  of  single  and  connected  speech  units  by 
degrading  them  with  single  distorted  parameters.  They  have  not  studied  the  combined 
effects  of  these  distortions  on  speech  recognition.  Harris  wrote  in  1960  that  the  combined 
effects  of  various  distortion  factors  are  multiplicative  in  their  degradation  of  the  spoken 
speech  signal.  Data  are  lacking  about  the  combined  effects  and  interactions  of  distortions 
as  delineated  by  Levitt  (1982),  particularly  as  they  relate  to  connected  speech.  This  lack 
of  information  is  understandable  considering  the  complexity  of  the  problem,  the  previous 
limitations  of  equipment,  and  the  involvement  of  central  processes  and  language  variables. 
Villchur (1974, 1977) attempted to devise the equipment to simulate the major aspects of
sensorineural hearing loss. In the late 1970s, Erber designed a hearing loss simulation
system patterned after Villchur’s work which he called the HELOS. Erber claims that the
HELOS’ output is a result of combined acoustic distortions; therefore, the device will be
discussed  in  detail. 

Gagne and Erber (1987) state that the purpose of the HELOS was to have an
inexpensive, simple and versatile device that would simulate various perceptual aspects of
sensorineural hearing loss. The major use of the HELOS is clinical: hearing loss
simulation could be used for the training and counseling of professionals and the friends
and family members of persons with a sensorineural hearing loss. The following
description  summarizes  design  and  electroacoustic  outputs  of  the  laboratory  prototype.  It 
is  important  to  know  that  the  commercially  available  HELOS  devices  are  individually 
constructed  by  Erber  and  do  not  have  accompanying  electronic  specification  information. 
According  to  the  authors,  the  HELOS  is  designed  to  simulate  the  following  perceptual 
characteristics  of  sensorineural  hearing  loss:  a)  loss  of  threshold  sensitivity,  b) 
audiometric  configuration,  c)  loudness  recruitment,  d)  reduced  dynamic  range,  and  e) 
reduced  frequency  selectivity. 

Threshold  attenuation  and  audiometric  configuration  were  simulated  by  first 
dividing  the  input  signal  into  two  independent  frequency  channels,  using  a 500  Hz  low- 
pass  filter  and  a 500  Hz  high-pass  filter.  Prior  to  output,  each  channel  was  equipped  with 
a center-clipping  device  that  removed  the  middle  portion  of  the  speech  wave  and 
preserved  the  amplitude  peaks,  thus  allowing  thresholds  of  detection  to  be  varied  within 
each channel independently. Gagne and Erber stated that the amount of center-clipping
allowed detection thresholds to be adjusted over a range of 60 dB, and to simulate several
audiometric configurations.

Erber states that another result of center-clipping is the simulation of loudness
recruitment. This claim is based on his observations that the RMS-voltage input-output
functions from the center-clipping circuits for various threshold settings were very similar
to loudness recruitment curves obtained from listeners with hearing impairment.
Therefore, as increased center-clipping simulated greater degrees of hearing loss,
the loudness growth functions became systematically steeper.
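
The link between center-clipping, raised thresholds, and a steeper input-output function can be illustrated numerically. The sketch below is not the HELOS circuit, whose specifications are not published. It assumes a hard center clipper that zeroes the waveform below a clip level and passes the peaks unchanged, a sine wave in place of speech, a single channel (the 500 Hz two-channel split is omitted), and arbitrary amplitude units. All names and values are hypothetical.

```python
import math


def center_clip(samples, clip_level):
    """Zero every sample whose magnitude is at or below clip_level; keep the peaks."""
    return [x if abs(x) > clip_level else 0.0 for x in samples]


def rms(samples):
    """Root-mean-square value of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def sine(amplitude, freq_hz=440.0, rate_hz=16000, n=16000):
    """One second of a sine wave used as a stand-in test signal."""
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * i / rate_hz) for i in range(n)]


def output_growth_db(amp_low, amp_high, clip_level):
    """Change in output level (dB) when the input amplitude rises from amp_low to amp_high."""
    low = rms(center_clip(sine(amp_low), clip_level))
    high = rms(center_clip(sine(amp_high), clip_level))
    return 20.0 * math.log10(high / low)


CLIP = 0.3  # hypothetical clip level, arbitrary units

# An input whose peaks stay below the clip level produces no output at all,
# which is how the clip level acts as a detection threshold.
inaudible = rms(center_clip(sine(0.25), CLIP))

# The same 6 dB input step, just above threshold and far above it.
near_threshold_growth = output_growth_db(0.32, 0.64, CLIP)
well_above_growth = output_growth_db(3.2, 6.4, CLIP)
```

Under these assumptions the output grows by more than the 6 dB input step just above the simulated threshold and by almost exactly 6 dB far above it, which is the recruitment-like shape described in the text. Raising `CLIP` moves the threshold upward and extends the steep region.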

Reduced  dynamic  range  was  simulated  in  the  HELOS  by  introducing  a noise, 
which  was  referred  to  as  “aversive”.  A broadband  masking  noise  occurred  when  the  input 
signal  exceeded  a preselected  (adjustable)  intensity  level.  The  intensity  level  of  the  noise 
stimulus  was  such  that  a listener  perceived  a signal-to-noise  ratio  poorer  than  -20  dB. 
Gagne  and  Erber  acknowledged  that  this  simulation  strategy  did  not  replicate  what  is 
actually  heard  by  listeners  with  sensorineural  hearing  loss.  However,  it  provided  an 
unpleasant  listening  experience  and  was  judged  to  affect  speech  perception  in  the  same 
manner  that  a sensorineural  hearing  loss  would  under  similar  listening  conditions. 
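
The gating logic of the "aversive" noise can be sketched as follows. This is a hedged illustration only: the frame-based processing, the Gaussian noise used as a stand-in for broadband masking noise, the trigger value, and all names are assumptions, not details reported by Gagne and Erber (1987). The only values taken from the text are the level-dependent trigger and a signal-to-noise ratio of -20 dB or poorer.

```python
import math
import random


def rms(samples):
    """Root-mean-square value of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def add_aversive_noise(frame, trigger_rms, snr_db=-20.0, seed=0):
    """Return the frame unchanged below trigger_rms; otherwise add masking noise.

    The noise is scaled so that the frame's signal-to-noise ratio equals
    snr_db. Values below -20 correspond to "poorer than -20 dB" in the text.
    """
    level = rms(frame)
    if level <= trigger_rms:
        return list(frame)
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in frame]
    scale = level / (10.0 ** (snr_db / 20.0)) / rms(noise)
    return [s + scale * n for s, n in zip(frame, noise)]


def tone(amplitude, n=800):
    """Short sine burst used as a stand-in for a speech frame."""
    return [amplitude * math.sin(2.0 * math.pi * i / 40.0) for i in range(n)]


soft, loud = tone(0.05), tone(0.8)
soft_out = add_aversive_noise(soft, trigger_rms=0.3)   # below the trigger: untouched
loud_out = add_aversive_noise(loud, trigger_rms=0.3)   # above the trigger: masked

# Recover the added noise and check the resulting signal-to-noise ratio.
added = [o - s for o, s in zip(loud_out, loud)]
loud_snr_db = 20.0 * math.log10(rms(loud) / rms(added))
```

The effect is that soft and moderate inputs pass through while intense inputs become effectively unusable, which limits the usable range of levels from above.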

Finally,  an  electronic  device  called  an  analog  delay  line  was  used  to  simulate  the 
hearing  loss  characteristic  of  reduced  frequency  selectivity.  Typically,  an  analog  delay  line 
has  been  used  to  simulate  reverberation  time.  In  the  HELOS,  this  device  was  modified  so 
that  the  clock  producing  the  delay  was  driven  by  a low-pass  filtered  noise  instead  of  the 
periodic signals more typically used. The result of modulating the delay time by the noise
was random delays which distorted the signal’s phase or “jittered” the signal waveform
randomly in time; the bandwidth of an audio signal increased as the amount of “jittering”
was increased. The authors observed that listeners reported a “less clear” or “more fuzzy”
(p. 234) message. They equated this perception to the difficulty that listeners with
sensorineural hearing loss have in resolving spectral components of a complex acoustic
signal and discriminating between sounds.
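
A digital analogue of the noise-modulated delay line can be sketched as below. It is an assumption-laden illustration rather than a model of the HELOS hardware: the one-pole low-pass filter, the linear interpolation, the delay values in samples, and every name are hypothetical. It shows only the principle stated in the text, namely that a delay driven by low-pass filtered noise shifts the waveform randomly in time.

```python
import math
import random


def lowpass_noise(n, smoothing, rng):
    """Gaussian noise smoothed by a one-pole low-pass filter (hypothetical modulator)."""
    value, out = 0.0, []
    for _ in range(n):
        value += smoothing * (rng.gauss(0.0, 1.0) - value)
        out.append(value)
    return out


def jitter_delay(signal, base_delay, depth, smoothing=0.05, seed=0):
    """Delay the signal by base_delay samples, randomly varied by up to +/- depth samples."""
    if depth < 0 or depth > base_delay:
        raise ValueError("depth must lie between 0 and base_delay")
    rng = random.Random(seed)
    mod = lowpass_noise(len(signal), smoothing, rng)
    peak = max(abs(m) for m in mod) or 1.0
    out = []
    for i in range(len(signal)):
        delay = base_delay + depth * mod[i] / peak  # time-varying delay in samples
        pos = i - delay
        if pos < 0:
            out.append(0.0)  # nothing has reached the delay-line output yet
            continue
        lo = int(pos)
        hi = min(lo + 1, len(signal) - 1)
        frac = pos - lo
        out.append((1.0 - frac) * signal[lo] + frac * signal[hi])  # linear interpolation
    return out


tone = [math.sin(2.0 * math.pi * i / 50.0) for i in range(500)]
pure_delay = jitter_delay(tone, base_delay=8.0, depth=0.0)  # no jitter: a fixed delay
jittered = jitter_delay(tone, base_delay=8.0, depth=6.0)    # delay wanders randomly
```

With `depth=0.0` the output is simply a delayed copy of the input. Increasing `depth` increases the random time shifts, which corresponds to turning up the amount of "jittering" on the device.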

The  HELOS  was  tested  using  ten  young  listeners  with  normal  hearing.  Five 
listening  conditions  were  simulated:  1)  no  signal  changes  (control),  2)  center  clipping  to 
produce  sloping  to  severe  attenuation  in  high  frequencies,  3)  center  clipping  to  produce 
severe  flat  attenuation  with  no  frequency  “jittering”,  4)  center  clipping  to  produce  severe 
flat  attenuation  with  frequency  “jittering”;  and  5)  center  clipping  to  produce  profound 
attenuation  with  no  frequency  “jittering”.  Pure  tone  thresholds,  PB-word  recognition 
scores, and vowel and consonant syllable (/b/ V /b/, VCV) identification scores were
obtained. Mean word recognition scores ranged from about 30% to 2% in the hearing
loss conditions with individual scores ranging from about 65% to 0%. The greatest
variability  was  seen  for  Condition  4,  severe  attenuation  with  all  types  of  distortions 
present. 

Gagne  and  Erber  (1987)  reported  that  vowel  and  consonant  identification  errors 
made  by  the  simulation  listeners  were  similar  to  those  reported  by  previous  investigators 
for  listeners  with  hearing  impairment  and  simulation  listeners  (Bilger  & Wang,  1976;  Hack 
& Erber, 1982; Owens et al., 1972; Sher & Owens, 1974). Average scores for vowel
identification ranged from 72.8% to 77.8% for Conditions 2-4. In general, the tendency
of listeners to respond with a back or neutral vowel increased as attenuation and other
distortions increased. Response patterns were most random for Condition 4, severe
attenuation and frequency jittering.

Correct  identification  of  consonants  ranged  in  average  scores  from  36.4%  to 
58.3%,  decreasing  with  increasing  signal  degradation.  Place  of  articulation  errors  were 
most  prevalent  when  high  frequency  cues  were  removed  (Condition  2).  As  attenuation 
increased  and  frequency  distortions  were  included,  errors  in  manner  of  articulation  (i.e., 
voiced/voiceless phonemes) were observed along with place-of-articulation
confusions. Again, responses became more random when frequency distortion was
included.  Accuracy  and  error  patterns  for  vowels  and  consonants  (34.8%  and  19.4%, 
respectively)  that  were  observed  for  the  most  profound  hearing  loss  (Condition  5)  were 
typical  of  listeners  with  profound  hearing  impairment. 

Gagne and Erber (1987) concluded that the HELOS system was successful in
simulating the various aspects of sensorineural hearing loss. It is important to note that
this conclusion is based on comparing their data to previously reported psychoacoustic
and speech reception data, not on the performance of control listeners with hearing
impairment. To date, no further data have been published on the HELOS in terms of
specific electroacoustic output or speech recognition performance of simulation listeners.
Only data related to using the HELOS as an audiologic management tool have been
published (Erber, 1995).

Based  on  the  reviewed  literature,  it  is  still  unclear  which  combined  acoustic 
distortions  under  what  listening  conditions  have  the  greatest  effect  on  speech  recognition. 
In addition, the studies have reported only measures of recognition for nonsense syllables,
monosyllabic words and isolated sentences. Recognition of passages or connected speech,
which  would  most  closely  relate  to  everyday  conversation,  has  not  been  reported  in  the 
hearing  loss  simulation  literature.  Therefore,  this  study  will  investigate  the  combined 
effects  of  threshold  attenuation  and  signal  distortions  on  the  recognition  of  three  types  of 
speech  stimuli:  nonsense  syllables,  monosyllabic  words  and  connected  speech  passages, 
presented  in  quiet  and  noise  listening  conditions. 

Experimental  Hypotheses 

1. There is no significant difference (p>.05) in the recognition of speech stimuli when
listening  through  two  threshold  settings  on  the  HELOS  while  keeping  slope  and 
distortion  settings  constant. 

2.  There  is  no  significant  difference  (p>.05)  in  the  recognition  of  speech  stimuli  when 
listening  through  three  distortion  settings  on  the  HELOS  while  keeping  slope  and 
threshold  settings  constant. 

3.  There  is  no  significant  interaction  (p>.05)  between  experimental  threshold  and 
distortion  settings  on  the  HELOS  as  indicated  by  recognition  of  speech  stimuli. 

4.  There  is  no  significant  difference  (p>.05)  in  the  recognition  of  speech  stimuli  presented 
in  quiet  or  in  background  noise  when  listening  through  an  experimental  setting  on  the 
HELOS. 

5.  There  is  no  significant  correlation  (p>.05)  between  percentage  recognition  of  three 
speech  stimuli:  consonants,  monosyllabic  words,  and  connected  speech  passages  under 
a specific experimental listening condition.

Experimental Question

What are the electroacoustic characteristics, i.e., intensity, frequency bandwidth and
phase, of the experimental settings on the HELOS?


CHAPTER  3 
METHODS 


The  methodology  used  for  the  experimental  study  was  determined  according  to 
information  from  the  reviewed  literature  and  pilot  research.  This  chapter  describes  the 
subjects  and  procedures  used  for  the  pilot  and  experimental  investigations.  Also,  the  type 
of  speech  stimuli,  equipment,  and  methods  of  data  analysis  that  were  used  are  described. 

Subjects 

Research  subjects  were  recruited  from  undergraduate  communication  disorders 
classes  by  the  instructor  or  researcher  announcing  the  opportunity  in  class.  Although 
participation  was  voluntary,  69  of  the  73  students  tested  received  school  related  credit, 
i.e.,  extra  credit  points,  observation  hours,  or  service  hours  for  a student  professional 
organization.  Recruiting  criteria  included:  1)  being  between  ages  18  and  30  years,  2) 
being  a native  speaker  of  English,  and  3)  reporting  hearing  to  be  within  the  normal  range. 
A power  analysis  of  pilot  data  determined  that  60  subjects  were  required  for  the  research 
design.  To  obtain  this  number,  73  students  were  tested.  Thirteen  individuals  did  not 
complete  the  experiment.  Six  persons  did  not  meet  experimental  criteria  for  pure  tone 
thresholds  and/or  no  history  of  middle  ear  disease,  one  person  did  not  learn  English  until 
age  eight,  three  persons  canceled  or  did  not  show  for  the  scheduled  experimental  session 
and three persons were dismissed due to the researcher’s procedural errors.

All 60 young adult subjects were between age 18 and 28 years with a mean age of
21 years. Through a personal interview, all subjects reported being native speakers of
English and having no known history of recurrent ear disease, placement of pressure
equalization tubes, or tinnitus. In addition, they reported no past difficulty with learning
phonics or following multiple spoken commands.

All  subjects  had  bilateral  pure  tone  thresholds  of  20  dB  HL  or  better  at  all 
frequencies  from  250  Hz  to  8000  Hz,  including  inter-octaves  (except  750  Hz).  All 
subjects  had  speech  recognition  thresholds  (SRT)  of  0 to  10  dB  HL  (mean  of  2 dB  in  the 
right and in the left ear). Word recognition scores using recorded Maryland CNC Lists 1 and 2
were 94% or better. A presentation level of 56 dB HL was used in order to be consistent
with  presentation  levels  of  experimental  speech  stimuli.  The  ear  that  had  the  lowest 
thresholds  and  better  speech  scores  overall  was  chosen  to  be  the  experimental  ear. 

In  order  to  screen  for  central  auditory  processing  disorders,  subjects  were  given  a 
modified version of the Synthetic Sentence Identification (SSI) test. Two sets of ten
sentences  were  presented  to  the  experimental  ear  with  the  Ipsilateral  Competing  Message 
(SSI-ICM)  and  one  set  of  ten  sentences  was  presented  with  the  Contralateral  Competing 
Message  (SSI-CCM).  The  stimuli  were  presented  at  80  dB  HL  with  a 0 dB  signal-to- 
competing message ratio (Jerger, 1975). All subjects reached the criterion of 90% or better for
both  conditions. 

In  addition,  all  subjects  were  screened  for  middle  ear  function  with  a portable 
immittance  screener  that  indicated  “Pass”  or  “Fail”  for  measures  of  middle  ear  pressure, 
compliance and gradient. All experimental subjects passed for both ears. Ipsilateral
acoustic reflexes were present, bilaterally, with a 1000 Hz and 2000 Hz pure tone
presented at 95 dB HL. A summary of subject characteristics can be found in Table 1.

Each  subject  was  seen  for  a one-hour  appointment  to  determine  if  he  or  she  met 
the  experimental  selection  criteria.  If  the  individual  met  the  selection  criteria  and  agreed 
to  continue  with  informed  consent,  a two-hour  experimental  listening  session  was 
scheduled  within  a seven  day  period.  Whenever  the  interval  between  sessions  was  longer 
than seven days, hearing thresholds were rechecked to ensure that change had not
occurred  due  to  physical  health  or  activities.  No  rechecked  thresholds  varied  by  more 
than  4 dB. 

Experimental  Speech  Stimuli 

Three types of speech stimuli were used: monosyllabic words,
consonant-vowel-consonant (CVC) nonsense syllables, and contextual speech passages. Each set of stimuli
was  chosen  using  the  following  three  criteria:  1)  supporting  psychometric  data,  2) 
frequency  of  use  in  typical  clinical  evaluations  in  order  to  increase  possibility  of  clinical 
application  of  data  and  3)  concerns  related  to  availability,  cost,  and  required  presentation 
time. 

The Northwestern University word lists (NU-6) originally developed by Tillman
and Carhart (1966) were chosen because of the number of lists and randomizations
available, the high inter-test reliability of the lists and their popular use (Martin, 1994).
The version used in this study was the audio compact disc recording of a female talker
produced by the Department of Veterans Affairs in 1991. Lists 1b through 4b were
re-recorded onto channel one of digital audio tapes (DAT). The recording included a 1000
Hz calibration tone. The six-talker babble noise from the Connected Speech Test or CST
(Cox, Alexander, and Gilmore, 1987) was re-recorded onto channel two. Further
explanation of the CST is found below. This recording included a calibration signal of
white noise. Intensity levels of the two calibration signals were measured by a sound level
meter to ensure that presentation levels were accurate.

A modified  version  of  the  Iowa  Medial  Consonant  Test  (IMCT)  was  used  to 
evaluate  perception  of  consonants  under  each  experimental  condition.  The  IMCT  is  a 
subtest  of  the  Iowa  Test  Battery  for  cochlear  implant  users  and  is  available  with  a female 
and  a male  talker.  In  order  to  shorten  administration  time,  this  test  was  limited  to 
investigating  consonants  within  a single  vowel  context,  /i/.  The  IMCT  consists  of  the 
major  twenty-four  consonants  in  spoken  English  in  a vowel-consonant-vowel  (VCV) 
format, i.e., /b, tʃ, d, f, g, h, dʒ, k, l, m, n, ŋ, p, r, s, ʃ, t, θ, ð, v, w, j, z, ʒ/. Although
the IMCT is a closed set test, the number of foils available (23) for each stimulus item is
large enough to decrease the probability for chance responses. In order to have four lists
or randomizations of 96 items (four occurrences of each consonant), List 1 with a female
talker was dubbed onto channel one of a DAT from the laserdisc original recording (Tyler
et al., 1986). The six-talker babble from the laserdisc recording was dubbed
simultaneously onto channel two of the DAT. A five second inter-stimulus interval was
used during which there was no audio signal. The recording included a 1000 Hz
calibration tone for both the stimulus and the noise tracks. See Appendix A for an
example list of the IMCT.



Contextual  speech,  the  third  type  of  speech  stimuli,  was  provided  by  the 
Connected  Speech  Test  (CST)  developed  by  Cox,  Alexander  and  Gilmore  (1987). 
According to Cox et al., the primary purpose for developing the CST was to produce “a
test of intelligibility of everyday speech ... for the specific purpose of serving as a
criterion measure of hearing aid benefit” (p. 119S). The authors stated that this type of
test  would  relate  more  closely  to  the  communication  interactions  that  listeners  encounter 
daily  than  would  the  monosyllabic  tests  typically  administered  during  a hearing  evaluation. 

The  CST  consists  of  48  test  passages  and  9 practice  passages.  Each  passage  is 
about  a particular  topic  and  consists  of  ten  sentences  with  each  sentence  being  seven  to 
ten words in length (see Appendix B for examples). According to Cox et al., syntax and
word  familiarity  were  strictly  controlled  by  using  only  simple  grammatical  forms  and  basic 
vocabulary  from  a children’s  educational  reading  source.  Each  passage  contains  twenty- 
five  key  words.  Five  levels  of  difficulty  or  intelligibility  are  represented  by  five  words 
each.  Approximately  one-half  of  the  key  words  are  monosyllabic,  one-third  are  bi-syllabic, 
and  the  remaining  words  are  multi-syllabic.  The  percentages  of  phoneme  occurrence 
according  to  phonetic  categories  of  the  key  word  consonants  are  similar  to  conversational 
data  reported  by  Fletcher  (1953).  A listener’s  intelligibility  score  is  the  accurate  repetition 
of  the  key  words  within  four  CST  passages.  The  CST  is  recorded  on  laserdisc  using  a 
female  talker.  On  a second  audio  channel,  a six-talker  babble  is  recorded  at  the  average 
level  (dB  Leq)  of  the  passages.  Whenever  the  stimulus  is  stopped,  the  listener  hears  no 
audio signal. A calibration signal of white noise is provided.

The development of the CST was based on rigorous psychometric procedures.
The female talker was selected by an investigation of six talkers’ speech intelligibility. Key
words  were  determined  based  on  their  average  and  range  of  intelligibility;  those  words  that 
were  correctly  repeated  more  than  95%  or  less  than  7%  of  the  time  were  eliminated. 
Difficulty  of  passages  was  found  to  be  equal  across  passages  and  across  signal-to-babble 
(S/B)  ratios  with  correlation  coefficients  ranging  from  .82  to  .93  (p<.01).  In  addition,  best 
correlations  of  transformed  scores  with  true  scores  indicated  that  a person’s  performance 
for  one  CST  passage  would  be  an  accurate  estimate  of  an  individual’s  true  score  for 
connected  speech  stimuli.  As  a result,  forty-eight  passages  with  correlation  coefficients 
of  .88  or  more  were  chosen  as  test  passages,  with  nine  passages  (r=.79-.87)  chosen  for 
practice  passages.  Finally,  critical  differences  and  performance-intensity  functions  are 
provided.  The  CST  is  the  only  published  conversational  speech  test  that  provides 
normative and reliability data for listeners with normal hearing (Cox et al., 1987) and with
hearing impairment (Cox et al., 1988). Cox and her colleagues (1987) concluded that
“the  test  is  sensitive  to  small  changes  in  S/B  ratio  so  that  a small  change  in  conditions 
produces  a large  change  in  scores”  (p.  125S). 
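
The key-word screening rule and the key-word scoring described above can be expressed compactly. The sketch below is illustrative only: the function names, the example words, and their percent-correct values are hypothetical and are not taken from Cox et al. (1987). Only the 7% and 95% exclusion limits and the idea of scoring correctly repeated key words come from the text.

```python
def screen_key_words(word_scores, low=7.0, high=95.0):
    """Drop candidate key words repeated correctly more than `high`% or less than `low`% of the time."""
    return {word: pct for word, pct in word_scores.items() if low <= pct <= high}


def key_word_score(key_words, repeated_words):
    """Percentage of key words that appear in the listener's repetition."""
    repeated = {word.lower() for word in repeated_words}
    hits = sum(1 for word in key_words if word.lower() in repeated)
    return 100.0 * hits / len(key_words)


# Hypothetical pilot intelligibility values (percent correct) for three candidate words.
candidates = {"window": 98.0, "glass": 60.0, "the": 3.0}
kept = screen_key_words(candidates)

# Hypothetical key words and one listener's repetition of a sentence.
score = key_word_score(["glass", "clean", "cloth", "water"], ["Clean", "the", "glass"])
```

In the CST itself a listener's score is computed over the key words of several passages rather than a single sentence, so the same function would be applied to the pooled key-word list.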

General  Equipment 

All  audiometric  procedures  were  done  using  a Grason  Stadler  (model  16) 
audiometer  with  the  signal  being  delivered  via  TDH-50  earphones.  The  audiometer 
received  a commercial  calibration  within  six  months  of  subject  selection.  A Quest  sound 
level  meter  (model  215)  was  used  to  perform  a calibration  check  of  pure  tones  and 
auxiliary channels through the TDH-50 earphones at the beginning, midpoint and end of
the experiment. Listening checks were performed at the beginning of each day of data
collection. Subjects listened to all tasks while sitting in a double-walled IAC sound booth.

Monosyllabic  and  nonsense  syllable  stimuli  were  delivered  to  the  audiometer  by  a 
Sony  Digital  Audio  Tape  (DAT)  player  (model  59ES).  A Pioneer  LaserVision  laser  disc 
player  (model  LD-V2200)  delivered  the  CST  passages  to  the  audiometer.  A television 
was  used  to  view  the  destination  frame  numbers  and  to  monitor  the  stimulus. 

For  the  listening  conditions  in  which  the  signal  was  distorted,  the  stimuli  were 
routed  from  the  audiometer  to  the  earphones  through  the  HELOS,  a portable  hearing  loss 
simulator  (Gagne  & Erber,  1987).  The  HELOS  has  five  main  controls:  1)  a gain  control, 
2)  a mode  button  (a  mild-moderate-severe  hearing  loss  versus  a profound  hearing  loss),  3) 
a threshold  attenuation  control,  4)  a distortion  control,  and  5)  a frequency  slope  switch 
(flat  versus  sloping  high  frequency).  The  mode  button  was  set  for  the  lesser  hearing  loss 
and  the  slope  switch  was  set  on  “flat”.  The  threshold  and  distortion  controls  are 
continuous, with each having setting numbers one through ten marked on the dial. For
this investigation two threshold and three distortion dial settings were used: threshold
dial settings 1 and 2 (T1, T2) and distortion dial settings 1, 3, and 5. To avoid confusion,
the distortion dial settings are referred to as D1, D2, and D3, respectively, for the
experimental conditions. Line markers or “ticks” were made on the dials to increase dial
setting reliability; the experimenter practiced placement with a sound level meter
measurement.

Electroacoustic  characteristics  of  the  HELOS  for  these  experimental  settings  were 
measured  in  terms  of  intensity,  bandwidth,  and  period  or  timing  shifts  using  a Tucker 
Davis Technologies System II. This instrumentation is a 16-bit digital-to-analog/analog-to-digital
converter. The System II provides an array processor operating system that
performs  the  mathematical  operations  of  a Fast  Fourier  Transform  (FFT)  algorithm.  The 
computer  software  allowed  for  quantitative  analysis  and  a visual  display  of  the  signal 
according  to  frequency,  amplitude,  time,  and  phase. 

Immittance  procedures  were  conducted  using  a Maico  (model  MA630)  portable 
immittance  bridge.  A biological  calibration  was  done  daily.  The  experimenter  used  a 
hand-held  otoscope  to  view  the  subjects’  ear  canals  to  verify  that  they  were  unoccluded 
and  that  their  eardrums  could  be  visualized. 

Procedures 


Pilot  Study 

Preliminary  to  the  onset  of  the  experiment,  a pilot  study  was  conducted  in  order  to 
determine the following: 1) which HELOS dial settings to investigate, 2) what signal-to-noise
(S/N) ratios to use for each speech stimulus and 3) how many subjects were needed to
provide appropriate data. It was found that only threshold dial settings 1 and 2 maintained
an audible signal for average conversational speech. Distortion settings 6 through 10 were
found to produce signal distortion to the degree that performance floor effects occurred.
Therefore, the experimenter determined that distortion dial settings 1, 3, and 5 provided a
workable range of distortion of the signal.

Volunteer  subjects  for  the  pilot  study  were  friends,  relatives  and  classmates  in 
classes  for  communication  disorders.  Subjects  met  audiometric  selection  criteria,  but 
ranged in age from 18 to 36 years. Twenty-three individuals listened to one or two of the
speech tasks in quiet (56 dB HL) and in noise that varied in intensity by 2 dB steps. The
S/N ratio for each speech task was determined by varying the noise signal intensity until a
listener’s score was obtained between 75 and 80 percent. This criterion was used to avoid
floor and ceiling effects in performance scores in the earphone plus noise condition.

Based  on  these  data,  a +8  dB  S/N  ratio  was  used  for  the  NU-6  word  lists  and  for  the 
IMCT,  and  a +2  dB  S/N  ratio  was  used  for  the  CST. 

The  experimental  procedure  was  administered  to  eleven  pilot  subjects.  These  data 
underwent  a statistical  power  analysis  to  determine  the  necessary  number  of  subjects  for 
the  research  design.  A minimum  of  sixty  subjects  was  recommended. 

In  addition,  observations  were  made  that  related  to  presentation  order  of  speech 
tasks  and  listening  conditions,  required  presentation  times  and  variables  such  as  subject 
fatigue.  It  was  determined  that  experimental  subjects  would  be  seen  over  two 
appointments,  one  hour  for  selection  and  HELOS  pure  tone  threshold  testing  and  two 
hours  for  experimental  testing.  To  prevent  repeating  the  same  task  in  succession,  the 
order  of  presentation  of  the  three  speech  tasks  would  remain  constant  within  any  one 
experimental  listening  condition  for  each  subject.  The  order  of  NU-6,  IMCT  and  CST 
was  judged  to  be  the  most  helpful  in  maintaining  subject  attention. 

Experimental  Study 

Sixty subjects were assigned randomly to six groups of ten subjects each. Each
group was assigned to one of six HELOS experimental settings: 1) T1D1, 2) T1D2,
3) T1D3, 4) T2D1, 5) T2D2, and 6) T2D3. Each subject was administered each speech
test monaurally under four listening conditions. These listening conditions were the
following: 1) in quiet (minus HELOS quiet or -HQ), 2) in noise (minus HELOS noise
or -HN), 3) HELOS in quiet (plus HELOS quiet or +HQ) and 4) HELOS in noise (plus
HELOS noise or +HN). See Figure 1. The order of these four conditions was randomized
across subjects. Also, subjects were warned or reminded before each listening task
whether they would hear the stimuli through earphones alone or through earphones with
the HELOS, and with or without the “cocktail party.”

During  the  initial  selection  testing,  pure  tone  thresholds  were  obtained  at 
frequencies  250  Hz  to  8000  Hz,  including  3000  Hz  and  6000  Hz,  using  an  adaptive 
method  of  limits  with  a 2 dB  interval.  Threshold  was  determined  as  the  lowest  intensity 
level  that  elicited  three  responses  on  ascending  presentation.  A speech  reception  threshold 
(SRT)  was  defined  as  the  lowest  intensity  level  at  which  three  or  more  of  six  words  were 
recognized  correctly.  Words  were  presented  via  monitored  live  voice  using  a 2 dB 
ascending  technique.  Subjects  were  familiarized  with  a written  list  of  alphabetized 
spondee  words  before  the  initial  threshold  testing.  Thresholds  for  each  frequency  and 
speech  were  similarly  obtained  for  each  subject  listening  under  the  assigned  HELOS 
experimental setting.

During each experimental HELOS condition, the presentation order of speech
materials remained constant across subjects for each of the four listening conditions, i.e.,
1) NU-6 words, 2) IMCT syllables, and 3) CST paragraphs. For example, Subject 1 was
assigned randomly the listening conditions in the order of -HN, +HQ, +HN, -HQ, and
Subject 2 was assigned randomly the listening conditions in the order of +HN, +HQ,
-HQ, -HN. For his or her first assigned listening condition, each subject would be
presented each speech test in the order of NU-6 words, IMCT syllables, and CST
passages. Upon completion of that condition, each subject would be presented the speech
tests again in the same order under his or her second assigned listening condition. The
same procedure was done for the third and fourth listening conditions.
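
The assignment and ordering scheme described above can be sketched in a few lines of Python. This is only an illustration: the text does not say how the randomization was carried out, so the use of a seeded pseudo-random generator, the function name `build_schedule`, and the dictionary layout are all assumptions.

```python
import random

# Labels taken from the text; everything else here is illustrative.
SETTINGS = ["T1D1", "T1D2", "T1D3", "T2D1", "T2D2", "T2D3"]
CONDITIONS = ["-HQ", "-HN", "+HQ", "+HN"]
TASKS = ["NU-6", "IMCT", "CST"]  # fixed task order within every listening condition


def build_schedule(n_subjects=60, seed=0):
    """Assign subjects to HELOS settings and randomize listening-condition order.

    Hypothetical helper: equal group sizes are enforced by shuffling a list
    that contains each setting the same number of times.
    """
    if n_subjects % len(SETTINGS) != 0:
        raise ValueError("n_subjects must be a multiple of the number of settings")
    rng = random.Random(seed)
    slots = SETTINGS * (n_subjects // len(SETTINGS))
    rng.shuffle(slots)

    schedule = []
    for subject_id, setting in enumerate(slots, start=1):
        condition_order = CONDITIONS[:]
        rng.shuffle(condition_order)  # order of the four conditions varies by subject
        # Within each condition the three speech tasks always run in the same order.
        sessions = [(cond, task) for cond in condition_order for task in TASKS]
        schedule.append(
            {
                "subject": subject_id,
                "setting": setting,
                "condition_order": condition_order,
                "sessions": sessions,
            }
        )
    return schedule


if __name__ == "__main__":
    first = build_schedule()[0]
    print(first["setting"], first["condition_order"])
```

Each subject ends up with twelve presentations (four listening conditions by three tasks), which matches the procedure described in the paragraph above.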

All  speech  stimuli  were  presented  at  56  dB  HL  because  the  CST  was  standardized 
using  that  intensity  level  and  a consistent  intensity  level  was  desired  across  speech 
materials.  The  normal  fluctuations  of  the  speech  envelope  have  a range  of  30  to  40  dB 
which  is  heard  easily  by  listeners  with  normal  hearing  (Ling,  1989).  Therefore  by 
providing  a suprathreshold  intensity  level  similar  to  everyday  listening  levels,  any 
attenuation  of  sounds  would  be  due  to  the  HELOS  and  not  to  the  normal  fluctuations  of 
speech.  As  discussed  previously,  the  intensity  levels  of  noise  stimuli  were  determined  by 
pilot  data.  The  S/N  ratios  were  +8  dB  (48  dB  HL)  for  the  NU-6  words  and  IMCT 
syllables  and  +2  dB  (54  dB  HL)  for  the  CST  paragraphs.  Before  presentation  of  each 
speech  test,  the  signal  was  calibrated  with  the  audiometer  VU  meter  set  to  0 using  the 
calibration  tone  that  accompanied  the  recording. 
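
The noise levels quoted in parentheses follow directly from the presentation level and the S/N ratios. A minimal sketch of that arithmetic, with `noise_level_db` as an illustrative helper name rather than anything used in the study:

```python
SPEECH_LEVEL_DB_HL = 56  # presentation level for all speech stimuli

# S/N ratios in dB, as reported for each speech task.
SNR_DB = {"NU-6": 8, "IMCT": 8, "CST": 2}


def noise_level_db(speech_level_db, snr_db):
    """Noise level implied by a speech level and a signal-to-noise ratio (both in dB)."""
    return speech_level_db - snr_db


noise_levels = {task: noise_level_db(SPEECH_LEVEL_DB_HL, snr) for task, snr in SNR_DB.items()}
print(noise_levels)  # {'NU-6': 48, 'IMCT': 48, 'CST': 54}
```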

The NU-6 word test was introduced to each subject with the following
instructions: “You will hear a female voice telling you to say a word, such as in ‘You will
say man.’ Please write the word down on this answer sheet. Try to write a response each
time, even if it is a guess. You will have just enough time to write a word before she says
the next one.” A full list of fifty words was presented. The order of lists was randomized
across subjects and across listening conditions for each subject. A percentage correct
score was recorded.

The IMCT was first introduced to subjects through a familiarization listening task
under earphone quiet conditions. The subject was provided an alphabetized list of the
consonant sounds (Appendix A) and told to listen to a recorded female voice saying each
one, with the consonant surrounded by the vowel /i/. He or she was requested to repeat
each syllable after the speaker. It was explained that the purpose of this task was to
familiarize him or her with the stimulus items and to allow the transcriber to become
familiar with the subject’s articulation. The list was removed before the test began. For
the  listening  condition,  each  subject  was  given  the  following  instructions:  “You  will  hear 
the  female  voice  say  these  syllables  all  mixed  up.  There  will  be  96  presentations.  Just 
relax  and  say  what  sound  you  think  that  you  heard.  I (or  an  assistant)  will  sit  in  the  booth 
with  you  and  transcribe  the  sounds  you  say.  Please  guess  if  you  are  not  sure.”  The  order 
of  lists  was  randomized  across  subjects  and  across  listening  conditions  for  each  subject.  A 
percentage  correct  score  was  recorded. 

The  Connected  Speech  Test  (CST)  was  presented  according  to  procedures 
specified  by  Cox,  et  al.  (1987).  Subjects  were  instructed  as  follows:  “You  will  hear  a 
female  voice  speaking  a short  passage  about  a specific  topic.  I will  always  tell  you  the 
topic  before  starting.  After  each  short  sentence,  I will  stop  the  tape.  Please  repeat  back 
what  you  heard  as  verbatim  as  possible.  Repeat  back  whatever  you  heard,  even  if  it  is 
only  words  that  don’t  make  sense.  It  is  okay  to  guess.  The  first  passage  is  practice.  Then 
you will hear four test passages.” Only exact responses of key words were scored as
correct, resulting in a score of total key words repeated accurately over the four test
passages. Practice passages and groups of four successive test passages were randomized
across subjects.

All  audiometric  testing  and  data  collection  were  conducted  by  a certified 
audiologist  with  over  fifteen  years  of  diagnostic  experience.  Two  undergraduate  assistants 
who  had  successfully  completed  a phonetics  course  transcribed  the  IMCT  for  13  subjects 
(22%), with the audiologist transcribing for the remaining 47 subjects (78%).
Reliability  of  transcriber  judgment  of  subject  responses  was  checked  by  the  simultaneous 
scoring  by  a second  transcriber  for  at  least  one  listening  condition.  These  checks  were 
completed  randomly  ten  times.  Point  by  point  agreement  analysis  revealed  an  agreement 
range  of  96.8  to  100%  with  a mean  agreement  of  98.5%.  See  Appendix  C for  reliability 
data.  In  addition,  accuracy  of  scoring  responses  was  checked  by  randomly  choosing  and 
rescoring  one  presentation  each  of  the  monosyllable  and  nonsense  syllable  tests  for  each 
subject.  Only  five  errors  were  found  in  the  120  lists  checked. 
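
Point-by-point agreement is conventionally the number of items on which two transcribers agree divided by the total number of items, expressed as a percentage. The sketch below assumes that definition; the function name and the five-item transcriptions are made-up placeholders, not data from this study.

```python
def point_by_point_agreement(transcript_a, transcript_b):
    """Percentage of items on which two transcribers recorded the same response."""
    if len(transcript_a) != len(transcript_b):
        raise ValueError("transcripts must contain the same number of items")
    if not transcript_a:
        raise ValueError("transcripts must not be empty")
    agreements = sum(a == b for a, b in zip(transcript_a, transcript_b))
    return 100.0 * agreements / len(transcript_a)


# Hypothetical example: two transcribers disagree on one of five consonants.
first_transcriber = ["p", "t", "k", "s", "m"]
second_transcriber = ["p", "t", "k", "f", "m"]
print(point_by_point_agreement(first_transcriber, second_transcriber))  # 80.0
```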

Data  Analysis 

This  study  investigated  three  independent  variables:  1)  threshold  attenuation,  2) 
distortion and 3) absence or presence of noise. Two threshold conditions, T1 and T2, and
three distortion conditions, D1, D2, and D3, were used. For each T and D combination,
stimuli were heard in the absence or presence of noise. Therefore, there were six
experimental conditions with the following designations: T1D1, T1D2, T1D3, T2D1,
T2D2, T2D3. Ten subjects were assigned to each experimental condition. Within each of
these experimental conditions, there were four listening conditions designated as -HQ,
-HN, +HQ, and +HN (minus HELOS in quiet and in noise, plus HELOS in quiet and in
noise).

Therefore,  the  experimental  design  for  this  study  consisted  of  four  completely 
crossed  experimental  factors:  two  within-subject  factors  (HELOS  and  noise  condition) 
and  two  between-subject  factors  (threshold  and  distortion).  Repeated  measures  analysis 
of  variance  (ANOVA)  was  used  to  compare  mean  test  scores  among  the  various 
combinations  of  factor  levels.  This  statistical  tool  tested  for  the  presence  of  significant 
main effects and 2-way, 3-way, and 4-way interactions among factors. Tukey’s multiple
comparison procedure was used to maintain a significance level of p<.05 when comparing
factor pair combinations, for example, T1D1 and T1D2, -HN and +HN.

A Pearson correlation of the arcsine square root transformed mean data was used to determine if
there was a significant (p<.05) relationship between performance scores for each type of speech
task within each listening condition.

Table 1. Criteria for subject selection and the characteristics of experimental subjects.

CHARACTERISTIC              CRITERIA                            MEAN

Number                      60 (6 groups of 10)                 55 female, 5 male
Age range                   18-30 years old                     21 yrs. (18-28)
Speech Reception            0 to 10 dB HL in quiet              RE 2 dB (35)
Threshold (SRT)                                                 LE 2 dB (25)
Pure tone threshold         0-20 dB HL @ all frequencies        PTA 512: RE 3.7 dB, LE 2.7 dB
                                                                PTA 124: RE 3.2 dB, LE 2.2 dB
Word recognition            94-100% in quiet                    RE 99%, LE 99%
(MD CNC lists)
Synthetic Sentence          90 to 100% @ 80 dB HL               ICM 97.5%
Inventory (SSI)             with 0 dB S/N ratio                 CCM 99.1%
Tympanograms                “Pass”
Acoustic reflexes (ipsi)    Present @ 95 dB HL at 1k & 2k Hz
History (self report)       No recurrent ear disease or PE tubes
                            No listening or reading problems
                            Native speaker of English

Figure 1. Experimental design according to HELOS settings and listening conditions.

[Figure: Six experimental conditions formed by two threshold settings (T1, T2) crossed with
three distortion settings (D1, D2, D3): T1D1, T1D2, T1D3, T2D1, T2D2, T2D3. Within each
experimental condition, every subject completed four listening conditions: without HELOS in
quiet (-HQ), without HELOS in noise (-HN), with HELOS in quiet (+HQ), and with HELOS in
noise (+HN).]

CHAPTER  4 
RESULTS 

Electroacoustic  Analysis  Data 

Measures  were  made  of  frequency  bandwidth  (BW)  and  amplitude  in  the  frequency 
domain,  and  periodicity  in  the  time  domain  for  each  HELOS  experimental  setting.  The 
Tucker Davis waveform generator was set to produce sine waves for 500 Hz, 1000 Hz,
2000 Hz and 4000 Hz with a voltage of 1.0 volt. This voltage created an approximate
output through TDH-50 earphones of 95 dB SPL for 500 Hz, 97 dB SPL for 1000 Hz, 96
dB SPL for 2000 Hz and 91 dB SPL for 4000 Hz. The sampling rate was 10,000 Hz and
a buffer size of 256 points was used. Although larger buffer sizes were attempted to
increase digital resolution, the graphic clarity and domain (x-axis) values became
inappropriate. Bandwidth data were complete for Q-30 at all HELOS settings and input
frequencies. Because the buffer restriction increased variability, limited information
was available for Q-10 and Q-20 measurements.

Bandwidth values for Q-30 measured for each HELOS setting can be seen in
Table 2 and in Figure 2. The T1D1 output is similar to the control pure tone Q-30 BW,
indicating that BW distortion is not introduced at this lowest HELOS setting. This
similarity is seen also when only threshold is changed in the T2D1 setting except for 2000
Hz which is slightly wider (control, 156 Hz; T2D1, 195 Hz). For all other HELOS
settings, BW increases in a monotonic manner with the distortion setting for all
frequencies, particularly 2000 Hz and 4000 Hz. It is important to note that BW was
greater for the T1D3 settings than for the T2D3 settings. The only exception is for 1000
Hz. These data may indicate that attenuation of the signal was decreasing the frequency
distortion in the upper frequencies.
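
The frequency resolution implied by the reported sampling rate and buffer size can be worked out directly. The sketch below assumes the bandwidths were read from a 256-point FFT with no zero padding, which the text does not state explicitly; under that assumption the spacing between spectral points is about 39 Hz, and the 156 Hz and 195 Hz values quoted above sit close to four and five such steps.

```python
SAMPLING_RATE_HZ = 10_000  # reported sampling rate
BUFFER_POINTS = 256        # reported buffer size


def bin_width_hz(sampling_rate_hz, n_points):
    """Spacing between adjacent FFT frequency points for an n-point transform."""
    return sampling_rate_hz / n_points


width = bin_width_hz(SAMPLING_RATE_HZ, BUFFER_POINTS)
print(width)  # 39.0625

# Bandwidths measured on this grid can only change in steps of one bin.
for n_bins in (4, 5):
    print(n_bins, n_bins * width)  # 4 -> 156.25, 5 -> 195.3125
```

If that reading is right, the difference between the control and T2D1 bandwidths at 2000 Hz amounts to a single spectral step, which is consistent with the description of it as only slightly wider.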

Pure tone signals were used to measure the root-mean-square amplitude of the
HELOS’ output for each experimental setting. The waveform generator of the Tucker
Davis System II routed the signal through the audiometer to the HELOS and back to the
spectral analyzer. Measurements were taken at each frequency at which behavioral
thresholds were measured, with the audiometer attenuator set on 70 dB HL (typical
calibration level) and 56 dB HL (speech stimuli presentation level). Line interference was
minimal at .0004 volt. Results indicated a consistent U-shaped output with the greatest
voltage at 250 Hz falling to a minimum at 1000 Hz and then rising to an output at 8000 Hz
approximately .150 to .200 volts less than at 250 Hz. Although absolute voltage decreased
as threshold and distortion settings increased, output pattern was consistent across
frequencies and across all HELOS experimental settings.

In  the  time  domain,  measurements  were  taken  at  3,  6,  12  and  18  ms.  These  times 
were  chosen  to  be  similar  to  the  HELOS  description  reported  by  Gagne  and  Erber  (1987). 
Again  the  buffer  point  size  of  256  provided  the  best  graphic  clarity  but  did  increase 
variability  in  measurement  points.  As  the  distortion  dial  was  rotated,  visual  observation  of 
a constant  signal  input  indicated  a randomly  appearing  distortion  of  period  and  amplitude 
within individual cycles. Also, it was observed with the increase in distortion setting that
the visual shape of an individual sine wave would become increasingly misshapen and
involve adjacent cycles. For the experimental HELOS settings, periodicity changes were
measured relative to the control pure tone’s period. Data in Table 3 show that the
increase of change in period occurred most often for HELOS settings involving D2 and
D3 distortion settings, especially for 1000 Hz.
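
The same sampling parameters also bound what can be resolved in the time domain. The following sketch, which uses only the reported 10,000 Hz sampling rate and 256-point buffer, shows the time step between samples, the duration of one buffer, and how many samples fall within one cycle of each test tone. The helper names are illustrative.

```python
SAMPLING_RATE_HZ = 10_000
BUFFER_POINTS = 256
TEST_FREQUENCIES_HZ = (500, 1000, 2000, 4000)


def sample_period_ms(sampling_rate_hz):
    """Time between successive samples, in milliseconds."""
    return 1000.0 / sampling_rate_hz


def buffer_duration_ms(sampling_rate_hz, n_points):
    """Total time spanned by one buffer, in milliseconds."""
    return 1000.0 * n_points / sampling_rate_hz


def samples_per_cycle(sampling_rate_hz, tone_hz):
    """Number of samples available to describe one period of a pure tone."""
    return sampling_rate_hz / tone_hz


print(sample_period_ms(SAMPLING_RATE_HZ))                   # 0.1
print(buffer_duration_ms(SAMPLING_RATE_HZ, BUFFER_POINTS))  # 25.6
for tone in TEST_FREQUENCIES_HZ:
    print(tone, samples_per_cycle(SAMPLING_RATE_HZ, tone))  # 20.0, 10.0, 5.0, 2.5
```

The 25.6 ms buffer covers all four measurement times (3, 6, 12 and 18 ms), while the small number of samples per cycle at 2000 Hz and 4000 Hz suggests that period changes at those frequencies would be harder to resolve than at 1000 Hz.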

In  summary,  electroacoustic  data  collected  indicated  a monotonic  increase  in 
bandwidth  as  distortion  settings  increased,  with  the  greatest  increase  at  2000  Hz  and  4000 
Hz.  However,  there  appeared  an  interaction  when  threshold  attenuation  was  employed 
because  BW  decreased  in  the  upper  frequencies.  The  signal  periodicity  appeared  to  change 
randomly  over  time  when  a distortion  setting  increased,  particularly  for  the  mid-frequency 
1000  Hz. 

Behavioral  Threshold  Data 

As  previously  discussed,  this  investigation  was  limited  to  the  HELOS  slope  button 
setting  for  a “flat”  audiometric  configuration.  Raw  data  for  subjects’  pure  tone  and 
speech (SRT) thresholds can be found in Appendix D. Threshold data for both ears were
collapsed to calculate mean values because of the strict subject threshold criteria and because
HELOS thresholds were obtained only for the experimental ear. The mean pure tone threshold
values,  as  seen  in  Figure  3,  clearly  indicate  that  the  audiometric  configuration  created  by 
the  HELOS  was  not  flat.  Instead,  the  configuration  is  a skewed  U-shaped  threshold  loss, 
with  the  greatest  attenuation  at  1000  Hz.  Across  HELOS  settings,  rising  threshold  means 
represent an average change of 5 dB per octave above 1000 Hz. Setting T1D1 is similar
to the control condition, which is consistent with the electroacoustic output data. The
greatest changes in thresholds were for setting T2D1 with the maximum mean threshold of
46.6 dB HL at 1000 Hz.

The mean values for spondee SRTs ranged from 3.7 dB HL for T1D1 to 42.1 dB HL for
T2D1. These results are consistent with pure tone thresholds, in particular within 2 dB of
the mean thresholds for 500 Hz.

Recognition  of  Speech  Materials 

Descriptive  data 

Performance data were analyzed separately for each type of speech stimulus under
each listening condition within each experimental condition (see Appendix E for subjects’
raw data). Figures 4, 5 and 6 graphically show the group means for each speech material.
Complete descriptive data can be seen in Tables 4, 5, and 6.

Group means for monosyllabic words (Figure 4 and Table 4) were stable across all
subjects in the control listening conditions, -HQ and -HN, indicating homogeneity of
scores. In -HQ, the means ranged from 94% to 98.6% with a standard deviation (S.D.)
range of 1.64 to 10.11%. Only condition T2D1 had an S.D. greater than 2.75, because of
one subject’s unusually poor score. For -HN, scores were uniformly lower, ranging from
53% to 63.2% with an S.D. range of 3.43 to 9.65.

For the HELOS conditions, means generally decreased as threshold and distortion
increased, ranging from 64.2% to 96.8% (S.D. 4.29 to 12.73) in quiet and from 33.2% to
51.2% (S.D. 7.84 to 14.47) in noise. Variability in scores was greater in noise except
for the higher distortion conditions, e.g., T1D3, T2D2, T2D3.

Group means for nonsense syllables (Figure 5 and Table 5) were stable across
control listening conditions, ranging from 94.44% to 98.3% with an S.D. range of 1.11 to
3.41 in quiet and 86.38% to 89.95% with an S.D. range of 3.68 to 12.68 in noise.

Mean performances for the HELOS conditions in quiet ranged from 84.1% to
97.2% (S.D. 2.57 to 7.95) and in noise ranged from 75.90% to 87.13% (S.D. 3.38 to 9.15).
Variability in scores showed the greatest increase with the highest distortion levels (T1D3,
T2D3) when noise was added.

For the CST passages, mean scores were very similar in quiet regardless of
HELOS setting (Figure 6 and Table 6). Group means for the control listening condition in
quiet ranged from 96.2% to 99.7% with an S.D. range of 0.48 to 6.56. In
noise, means were lower, ranging from 32.3% to 90.5% with an S.D. range of 4.90 to
7.90.

Mean recognition for passages under HELOS conditions in quiet was similar to the
control condition in quiet, with scores ranging from 97.4% to 98.7% (S.D. 1.71 to 2.26).
However, in noise conditions, scores varied from 66.8% to 85.2% with an S.D. range of
5.97 to 9.11. The lowest mean score and highest variability were for the T2D3 setting.

In summary, mean recognition scores decreased and variability increased as threshold and
distortion settings increased for all speech materials. In quiet control conditions, similarity
of scores indicated homogeneity of listeners’ abilities. In quiet +HELOS conditions,
scores monotonically decreased with threshold attenuation and increased distortion
setting. For monosyllabic words and nonsense syllables, recognition in noise (+HN) was
similar to the control listening condition (-HN) for settings T1D1, T1D2 and T2D1.
However, when combined T and D settings became more extreme (T1D3, T2D2, and
T2D3), recognition means were worse under +HELOS noise conditions than control noise
(-HN) conditions. These data would indicate that for these subjects, recognition in noise
was not affected by threshold attenuation alone but only in combination with other
distortions of the signal. Similarly, mean performance for the connected speech passages
indicated little difference in recognition in noise between the control and HELOS
conditions for T1D1, T1D2 and T2D2. When threshold attenuation occurs alone (T2D1),
distortion occurs alone (T1D3) or the extreme TD combination occurs (T2D3), the control
noise condition (-HN) is similar to the HELOS quiet condition (+HQ). However, mean
recognition scores are much poorer for these TDs in noise.

Analysis  of  Variance 

A repeated  measures  analysis  of  variance  (ANOVA)  was  used  to  compare  mean 
test  scores  to  determine  if  there  were  any  significant  (p<.05)  main  effects  and  interactions 
between the experimental variables. This study consisted of four completely crossed
experimental factors applied to six groups of ten subjects. The two within-subject factors
were absence or presence of HELOS simulation (-H, +H) and absence or presence of noise
(Q, N). The two between-subject factors were two levels of threshold (T1, T2) and three
levels of distortion (D1, D2, D3). Therefore, the design of the ANOVA, as diagrammed
in Figure 1, is subjects (N=10 x 6 groups) in two thresholds (T1, T2) in three distortions
(D1, D2, D3) by HELOS condition (-H, +H) by noise condition (Q, N). Because test
scores were expressed as percentages and often fell outside a range of 30% to 70%,
individual scores were transformed via the arcsine-square root transformation prior to
analysis in order to stabilize variances. See Table 7 for the ANOVA findings for main
effects and two-, three- and four-way interactions. In order to compare pairs of individual
cell means to determine significant (p<.05) differences, the Tukey Wholly Significant
Difference (WSD) procedure was used for multiple comparisons. Findings will be
presented according to each type of speech material.
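
A minimal sketch of the arcsine-square root transformation mentioned above follows. It assumes the plain form, the arcsine of the square root of the proportion correct, in radians; some texts multiply the result by two, and the source does not say which variant was used. The function name is illustrative.

```python
import math


def arcsine_sqrt(percent_correct):
    """Arcsine-square root transform of a percentage score, in radians.

    Assumes the unscaled form asin(sqrt(p)) with p = percent_correct / 100.
    """
    if not 0.0 <= percent_correct <= 100.0:
        raise ValueError("percent_correct must lie between 0 and 100")
    return math.asin(math.sqrt(percent_correct / 100.0))


# Equal 5-point differences in raw score are stretched near the ceiling
# relative to the middle of the range, which is what stabilizes the variance.
for low, high in ((50.0, 55.0), (90.0, 95.0)):
    print(low, high, round(arcsine_sqrt(high) - arcsine_sqrt(low), 4))
```

On this scale a score of 50% maps to pi/4 and a score of 100% maps to pi/2, so transformed differences are expressed in radians rather than percentage points.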

Monosyllable words (NU-6)

For  the  NU-6  monosyllable  words,  significant  (p<.05)  three-way  interactions  were 
found  for  HELOS  x noise  x distortion  and  for  HELOS  x threshold  x distortion.  Due  to 
these  interactions,  the  main  effects  and  two-way  interactions  will  not  be  considered. 

In order to determine which specific conditions were interacting, pairwise
comparisons, using the WSD test, were completed for between-subject and within-subject
factors. Comparisons for between-subject factors for each TD setting are shown in
Table 8 and plotted in Figure 7. No significant (p<.05) differences were found for the TD
conditions  without  the  HELOS  (-HQ  and  -HN).  This  finding  validates  the  homogeneity  of 
the  subject  sample.  Under  the  HELOS  conditions  in  quiet  (+HQ),  there  was  a significant 
(p<.05)  difference  in  mean  performance  with  threshold  attenuation.  The  only  significant 
(p<.05)  difference  for  distortion  is  the  extreme  setting  with  no  threshold  attenuation 
(T1D3).  However,  there  were  no  significant  (p<.05)  differences  with  changes  between 
combined  threshold  and  distortion  settings  (T2D1,  T2D2  and  T2D3).  When  subjects 
listened  in  noise  with  the  HELOS  (+HN),  the  only  significant  (p<.05)  differences  were 
found  between  T1D2  and  T2D3  and  between  T2D1  and  T2D3.  Therefore,  there  was 
some indication that threshold attenuation and distortion each had an effect on performance.

Table 9 and Figure 8 show pairwise comparisons for within-subject factors:
-/+HELOS in quiet and -/+HELOS in noise. It can be seen by the groupings and
intersecting lines that there is a significant difference between quiet and noise conditions,
regardless  of  TD  setting.  Except  for  T1D2  and  T2D1,  the  most  severe  listening  condition, 
+HN,  was  significantly  different  from  all  other  listening  conditions.  For  those  two 
exceptions,  +HN  performance  was  similar  to  -HN  performance.  The  two  groups  who  had 
significant  (p<.05)  differences  between  all  listening  conditions  were  settings  T1D3  and 
T2D3  which  are  the  settings  with  the  highest  distortion. 

Nonsense  syllables  (IMCT) 

Significant  (p<.05)  three-way  interactions  were  found  between  experimental 
factors  for  nonsense  syllable  recognition  (see  Table  7)  for  HELOS  x threshold  x 
distortion,  for  noise  x threshold  x distortion  and  for  HELOS  x noise  x threshold.  The 
presence  of  these  interactions  precludes  consideration  of  main  effects  and  two-way 
interactions. 

Pairwise WSD comparisons, as shown in Table 10 and plotted in Figure 9,
indicate which factors (TD settings) are involved in the significant interactions. There
were no significant (p<.05) differences in any of the conditions without the HELOS except
between T1D1 and T2D1 in quiet (-HQ). This finding indicated a statistically significant
(p<.05) difference with a change of threshold in quiet. However, because this is a -H
condition, the difference reflects subject variability, given the small critical
difference (.0928).

For the TD conditions in quiet (+HQ), there were significant (p<.05) differences
when only distortion was highest (T1D3) and when threshold changed across all distortion
settings (T2D1, T2D2, T2D3). However, changes in distortion at all T2 levels resulted in
no significant (p<.05) differences in group means.

For the TD settings in noise (+HN), there were significant (p<.05) differences
between the extreme distortion settings within a threshold setting (e.g., T1D1 and T1D3,
T2D1 and T2D3) and between the two most different TD settings overall, T1D1 and
T2D3. These findings indicate significant (p<.05) differences in recognition of the IMCT
nonsense syllables when noise is present and both types of distortions are highest.

The pairwise comparisons for the within-subject factors (-/+HQ and -/+HN) are
shown in Table 11 and plotted in Figure 10. Group means were always significantly
different when noise was present, regardless of HELOS use or setting. However, there
were no significant (p<.05) differences between -HN and +HN group means except in the
highest distortion settings, T1D3 and T2D3. It is also interesting to note that in the
highest distortion-only setting (T1D3) and in all the T2 settings (regardless of distortion
setting), there were no significant (p<.05) differences between -HN and +HQ means.
This finding indicates that speech recognition in noise with no hearing loss simulation was
similar to recognition in quiet under simulated hearing loss conditions.

Connected  speech  passages  (CST) 

Significant (p<.05) two-way interactions were found for HELOS x noise and for
HELOS x distortion (see Table 7). The factor of threshold was a significant (p<.05) main
effect in the recognition of passages. The mean for the T1 condition (91.1%) was
significantly greater than the mean for the T2 condition (88.1%) across all other
conditions. Due to interactions, all other main effects were not considered.

For  the  HELOS  x noise  interaction,  no  significant  (p<.05)  difference  was  found 
between  the  -HQ  and  +HQ  conditions.  However,  there  were  significant  (p<.05) 
differences  between  quiet  and  noise  conditions  across  the  HELOS  conditions  and  between 
-HN and +HN conditions. See Table 13 and Figure 12a for these results. Table 12 and
Figure 11 contain the full comparison data.

For the HELOS x distortion two-way interaction, pairwise Tukey WSD
comparisons found no statistically significant (p<.05) differences between the three levels
of distortion across both HELOS conditions (see Table 15). However, the comparison
for D3 is close to significance (.066, critical difference = .086). It should be noted that
very conservative degrees of freedom were used to reduce Type I error in rejecting the null
hypothesis. Therefore, to explain the interaction finding, means were plotted graphically,
as seen in Figure 12b. Clearly, interaction is occurring as the lines approach and diverge
from one another. Full comparison data are shown in Table 14.
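
The critical-difference comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes the transformed +H means and the critical difference from Table 15, and it assumes that the reported value of .066 is the D1 versus D3 difference under +H. The function and variable names are hypothetical and are not part of the original analysis.

```python
from itertools import combinations

# Transformed (arcsine square root) cell means for the +H condition, from Table 15.
PLUS_H_MEANS = {"D1": 1.270, "D2": 1.270, "D3": 1.204}
CRITICAL_DIFFERENCE = 0.0862  # value given in the Table 15 caption


def pairwise_differences(means, critical_difference):
    """Compare every pair of cell means against the critical difference.

    Returns a list of (label_a, label_b, abs_difference, is_significant).
    """
    results = []
    for (a, mean_a), (b, mean_b) in combinations(means.items(), 2):
        diff = abs(mean_a - mean_b)
        # A pair is flagged only when its difference exceeds the critical value.
        results.append((a, b, diff, diff > critical_difference))
    return results


for a, b, diff, significant in pairwise_differences(PLUS_H_MEANS, CRITICAL_DIFFERENCE):
    print(f"{a} vs {b}: difference = {diff:.3f}, significant = {significant}")
```

Under these assumptions no pair exceeds the critical difference, which agrees with the single grouping shown for each HELOS condition in Table 15.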

Correlations  of  Speech  Materials 

Significant (p<.05) correlations were found between all three speech stimuli for
quiet and noise conditions. See Table 16 for Pearson product-moment correlation
r-values found for each listening condition (-/+HQ and -/+HN). In the control conditions
(-H), monosyllabic words were significantly correlated with nonsense syllables in quiet,
and significantly correlated with the passages in noise. However, in quiet, neither
monosyllabic words nor nonsense syllables were significantly correlated with the passages,
nor were nonsense syllables significantly correlated with passages in noise.

In  the  simulated  hearing  loss  conditions  (+HQ  and  +HN),  there  were  significant 
relationships  between  all  three  speech  materials  both  in  quiet  and  noise  listening 
conditions. It is important to note that although the r-values are below .50, the individual
p-values are highly significant.
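
A modest r-value can still be statistically significant when the sample is reasonably large. The sketch below illustrates this with the standard t test for a Pearson correlation. It assumes that all 60 subjects contribute to each correlation, uses an approximate tabled critical t for 58 degrees of freedom, and uses hypothetical r-values that are not taken from Table 16.

```python
import math

N_SUBJECTS = 60      # total sample size; assumed to apply to each correlation
T_CRITICAL = 2.002   # approximate two-tailed .05 critical t for df = 58 (assumed table value)


def t_from_r(r, n):
    """t statistic for testing H0: rho = 0, given Pearson r and sample size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Hypothetical r-values chosen for illustration only.
for r in (0.20, 0.30, 0.45):
    t = t_from_r(r, N_SUBJECTS)
    print(f"r = {r:.2f}, t = {t:.2f}, significant at .05: {abs(t) > T_CRITICAL}")
```

With these assumptions, an r-value of about .30 already exceeds the critical t, whereas .20 does not.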

Table 2. Q-30 bandwidth (BW) changes for pure tone signals transduced by the HELOS.
Differences between control and experimental BWs are shown.

               500 Hz          1000 Hz         2000 Hz         4000 Hz
Setting      BW   Differ     BW   Differ     BW   Differ     BW   Differ
CONTROL      98     —       200     —       156     —       313     —
T1D1        100      2      195     -5      156      0      313      0
T1D2        117     19      235     35      391    235     2969   2656
T1D3        410    312      547    347     2383   2148     4141   3828
T2D1         97     -1      194     -6      195     39      312     -1
T2D2        195     97      508    308      743    587     1597   1284
T2D3        332    234      704    504     1641   1485     3125   2812

Figure 2. The Q-30 bandwidth of experimental HELOS settings using pure tone inputs.
(Axes: bandwidth in Hertz, 500 to 4500; experimental setting.)

Table 3. Periodicity differences of pure tone input signal with and without the HELOS
at specific time points.

Input     Measure                    HELOS SETTINGS
Tone      Points     T1D1    T1D2    T1D3    T2D1    T2D2    T2D3
500 Hz    5 ms         0     -.10    -.10      0     -.10    -.10
          6 ms         0       0       0       0       0     -.10
          12 ms        0     -.10      0       0       0     -.10
          18 ms        0     -.10      0       0       0     -.10
1000 Hz   5 ms         0       0     +.05      0       0     +.10
          6 ms         0     -.05    +.10      0     -.05      0
          12 ms        0     -.05    -.15      0       0     +.05
          18 ms        0     +.05    +.05      0       0     -.05
2000 Hz   5 ms         0       0       0       0       0       0
          6 ms         0       0       0       0     -.10    -.20
          12 ms        0       0     +.10      0       0     +.10
          18 ms        0       0       0       0       0       0
4000 Hz   5 ms         0     +.10      0       0       0       0
          6 ms         0       0       0       0       0       0
          12 ms        0       0       0       0     -.05    +.10
          18 ms       CNE     CNE     CNE     CNE     CNE     CNE

CNE: could not evaluate



Figure 4. Group mean scores for recognition of monosyllabic words (NU-6) according to experimental HELOS settings.
(x-axis: experimental condition, T1D1 through T2D3.)

Table 4. Raw untransformed scores for monosyllabic words (NU-6).

Experimental  Listening             Standard    Median   Minimum   Maximum
Condition     Condition    Mean     Deviation   Score    Score     Score
T1D1          -HQ          98.4       2.06        99       94       100
              -HN          63.2       6.26        64       54        72
              +HQ          96.8       4.82        98       84       100
              +HN          47.8      14.47        51       12        60
T1D2          -HQ          97.6       2.95        99       92       100
              -HN          54.4       9.65        58       40        70
              +HQ          94.4       4.29        95       86       100
              +HN          51.2      11.74        49       34        78
T1D3          -HQ          98.6       1.64        99       96       100
              -HN          59.0       8.34        59       46        70
              +HQ          88.4       7.98        90       72       100
              +HN          40.6       8.43        43       20        50
T2D1          -HQ          94.0      10.11        97       66       100
              -HN          54.6       6.32        54       44        64
              +HQ          73.8       7.74        74       64        86
              +HN          48.4      12.10        47       36        78
T2D2          -HQ          98.6       2.11       100       94       100
              -HN          63.2       8.85        63       46        76
              +HQ          64.2      12.73        67       32        76
              +HN          43.0       7.84        44       26        52
T2D3          -HQ          97.8       1.75        98       96       100
              -HN          53.0       3.43        53       48        58
              +HQ          66.8       9.39        67       48        80
              +HN          33.2       8.70        34       20        48

Figure 5. Group mean scores for recognition of nonsense syllables (IMCT) according to experimental HELOS settings.

Table 5. Raw untransformed scores for nonsense VCV syllables (IMCT).

Experimental  Listening             Standard    Median   Minimum   Maximum
Condition     Condition    Mean     Deviation   Score    Score     Score
T1D1          -HQ          98.30      1.11       98.4     95.8     100.0
              -HN          89.32      4.83       90.5     79.1      93.7
              +HQ          97.15      2.57       97.3     93.7     100.0
              +HN          87.13      3.38       86.4     82.2      92.7
T1D2          -HQ          95.48      2.61       96.3     91.6      97.9
              -HN          89.95      4.38       90.0     84.3      97.9
              +HQ          95.80      2.95       96.3     91.6     100.0
              +HN          86.94      5.44       86.4     78.1      94.7
T1D3          -HQ          95.98      2.36       94.7     93.7     100.0
              -HN          89.43      3.68       90.0     83.3      94.7
              +HQ          89.11      4.58       89.0     82.2      95.8
              +HN          75.89      9.15       75.5     59.3      86.4
T2D1          -HQ          94.44      3.31       94.2     89.5      97.9
              -HN          89.12      4.98       90.6     80.2      95.8
              +HQ          84.13      7.95       84.8     71.8      94.7
              +HN          78.60      7.84       79.6     66.6      91.6
T2D2          -HQ          97.66      1.46       97.3     95.8     100.0
              -HN          86.38     12.68       89.5     53.0      96.8
              +HQ          88.19      7.46       85.9     76.0      97.9
              +HN          83.39      5.02       82.7     75.0      91.6
T2D3          -HQ          95.82      3.41       96.3     89.9     100.0
              -HN          87.87      4.92       88.5     80.2      96.8
              +HQ          87.14      3.53       86.9     81.2      94.7
              +HN          77.66      7.59       79.6     59.3      85.4

Figure 6. Group mean scores for recognition of connected speech passages (CST) according to experimental HELOS settings.
(x-axis: experimental condition, T1D1 through T2D3.)

Table 6. Raw untransformed scores for passages (CST).

Experimental  Listening             Standard    Median   Minimum   Maximum
Condition     Condition    Mean     Deviation   Score    Score     Score
T1D1          -HQ          99.6        .51      100.0      99       100
              -HN          87.6       8.48       88.5      74       100
              +HQ          98.7       2.26      100.0      94       100
              +HN          85.2       5.97       86.0      73        95
T1D2          -HQ          99.1       1.10       99.5      97       100
              -HN          87.7       5.81       88.5      79        96
              +HQ          98.4       1.71       99.0      94       100
              +HN          80.4       6.13       79.5      69        88
T1D3          -HQ          99.7        .48      100.0      99       100
              -HN          90.5       4.90       90.0      83        97
              +HQ          98.5        .84       98.5      97       100
              +HN          68.3      13.24       71.0      40        84
T2D1          -HQ          96.2       6.56       98.0      78       100
              -HN          82.3       7.90       84.5      70        94
              +HQ          97.7       2.26       98.5      93       100
              +HN          72.2       9.18       76.0      53        83
T2D2          -HQ          99.30       .82       99.5      98       100
              -HN          86.9       6.36       88.5      75        95
              +HQ          97.6       2.36       98.0      93       100
              +HN          75.4       7.36       74.5      66        86
T2D3          -HQ          99.5        .70      100.0      98       100
              -HN          86.5       9.41       87.0      66        98
              +HQ          97.4       1.71       97.0      94       100
              +HN          66.8       9.11       69.0      52        77


Table 7. Full model F-test p-values for Analysis of Variance (ANOVA). There were two
between-subject variables (threshold and distortion) and two within-subject variables
(HELOS and noise). Asterisks (*) indicate those conditions which were significant with
a p<.05.

                                          Monosyllabic   Nonsense
Experimental                              Words          Syllables    Passages
Condition                                 NU-6           IMCT         CST
Threshold                                 .0001*         .0002*       .0025*
Distortion                                .0118*         .0096*       .5900
Threshold x Distortion                    .4937          .0087*       .0746
HELOS                                     .0001*         .0001*       .0001*
HELOS x Threshold                         .0001*         .0008*       .5252
HELOS x Distortion                        .0004*         .0045*       .0001*
HELOS x Threshold x Distortion            .0011*         .0396*       .6665
Noise                                     .0001*         .0001*       .0001*
Noise x Threshold                         .0001*         .0435*       .2042
Noise x Distortion                        .4070          .9595        .0932
Noise x Threshold x Distortion            .0964          .0296*       .8484
HELOS x Noise                             .0001*         .5229        .0001*
HELOS x Noise x Threshold                 .0001*         .0053*       .4875
HELOS x Noise x Distortion                .1645          .7664        .0952
HELOS x Noise x Threshold x Distortion    .4985          .2256        .2339

Table 8. Monosyllabic words (NU-6): Pairwise Tukey WSD comparison of individual
cell means within each TD experimental setting by HELOS and noise subgroup. Those
pairs found significantly (p<.05) different are indicated by different group numbers.
Critical difference equals 0.1485.

Listening    Experimental             Transformed
Condition    Condition       Mean     Mean          Grouping
-HQ          T1D1            98.4     1.483         1
             T1D2            97.6     1.462         1
             T1D3            98.6     1.487         1
             T2D1            94.0     1.376         1
             T2D2            98.6     1.497         1
             T2D3            97.8     1.447         1
-HN          T1D1            63.2      .920         1
             T1D2            54.4      .830         1
             T1D3            59.0      .877         1
             T2D1            54.6      .831         1
             T2D2            63.2      .921         1
             T2D3            53.0      .815         1
+HQ          T1D1            96.8     1.440         1
             T1D2            94.4     1.361         1 2
             T1D3            88.4     1.249         2
             T2D1            73.8     1.037         3
             T2D2            64.2      .931         3
             T2D3            66.8      .959         3
+HN          T1D1            47.6      .756         1 2
             T1D2            51.2      .798         1
             T1D3            40.6      .688         1 2
             T2D1            48.4      .770         1
             T2D2            43.0      .714         1 2
             T2D3            33.2      .611         2


Figure 7. Group means of NU-6 monosyllabic words for each TD experimental setting
re: noise condition. The “T-Score” is the transformed mean score derived from an
arcsine square root procedure. (Panels by TD setting; x-axis: noise condition.)
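
The arcsine square root procedure named in the figure captions can be sketched as follows. This is a minimal illustration that assumes the plain form, arcsin of the square root of the proportion correct, expressed in radians; the source does not state the exact variant, and the function name is hypothetical. The tabled T-scores appear to be means of transformed individual scores, so transforming a group mean percentage gives only an approximate match.

```python
import math


def arcsine_sqrt(percent_correct):
    """Arcsine square root transform of a percent-correct score (result in radians).

    Assumed form: asin(sqrt(p)), where p is the proportion correct (0 to 1).
    """
    proportion = percent_correct / 100.0
    return math.asin(math.sqrt(proportion))


# Example: transform a few percent-correct scores.
for score in (33.2, 63.2, 98.4):
    print(f"{score:5.1f}% -> {arcsine_sqrt(score):.3f}")
```

The transform stretches the scale near 0% and 100%, which helps stabilize the variance of percentage scores before an analysis of variance.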

Table 9. Monosyllabic words (NU-6): Pairwise Tukey WSD comparison of individual
cell means within each experimental TD setting by noise subgroup. Those pairs found
significantly different (p<.05) are indicated by different group numbers. Critical difference
equals 0.1335.

Experimental  Listening              Transformed
Condition     Condition     Mean     Mean          Grouping
T1D1          -HQ           98.4     1.483         1
              +HQ           96.8     1.440         1
              -HN           63.2      .920         2
              +HN           47.6      .756         3
T1D2          -HQ           97.6     1.462         1
              +HQ           94.4     1.361         1
              -HN           54.4      .830         2
              +HN           51.2      .798         2
T1D3          -HQ           98.6     1.487         1
              +HQ           88.4     1.249         2
              -HN           59.0      .877         3
              +HN           40.6      .688         4
T2D1          -HQ           94.0     1.376         1
              +HQ           73.8     1.037         2
              -HN           54.6      .831         3
              +HN           48.4      .770         3
T2D2          -HQ           98.6     1.497         1
              +HQ           64.2      .931         2
              -HN           63.2      .921         2
              +HN           43.0      .714         3
T2D3          -HQ           97.8     1.447         1
              +HQ           66.8      .959         2
              -HN           53.0      .815         3
              +HN           33.2      .611         4

Figure 8. Group means of NU-6 monosyllabic words for each HELOS and noise
condition re: threshold setting. The “T-Score” is the transformed mean score derived
from an arcsine square root procedure. (Panels: -HQ, -HN, +HQ, +HN; x-axis: threshold;
y-axis: T-score.)

Table 10. Nonsense syllables (IMCT): Pairwise Tukey WSD comparison of individual
cell means within each TD experimental setting by HELOS and noise subgroup. Those
pairs found significantly different (p<.05) are indicated by different group numbers.
Critical difference equals 0.0928.

Listening    Experimental              Transformed
Condition    Condition       Mean      Mean          Grouping
-HQ          T1D1            98.30     1.449         1
             T1D2            95.48     1.364         1 2
             T1D3            95.98     1.385         1 2
             T2D1            94.44     1.343         2
             T2D2            97.66     1.429         1 2
             T2D3            95.82     1.387         1 2
-HN          T1D1            89.32     1.244         1
             T1D2            89.95     1.256         1
             T1D3            89.43     1.243         1
             T2D1            89.12     1.242         1
             T2D2            86.38     1.215         1
             T2D3            87.87     1.222         1
+HQ          T1D1            97.15     1.433         1
             T1D2            95.80     1.383         1
             T1D3            89.11     1.241         2
             T2D1            84.13     1.171         2
             T2D2            88.19     1.240         2
             T2D3            87.14     1.207         2
+HN          T1D1            87.13     1.206         1
             T1D2            86.94     1.208         1
             T1D3            75.89     1.063         2
             T2D1            86.94     1.208         1
             T2D2            83.39     1.155         1 2
             T2D3            77.66     1.082         2

Figure 9. Group means of IMCT nonsense syllables for each TD experimental setting
re: noise condition. The “T-Score” is the transformed mean score derived from an
arcsine square root procedure. (y-axis: T-score.)

Table 11. Nonsense syllables (IMCT): Pairwise Tukey WSD comparisons of individual
cell means within each TD experimental setting by noise subgroup. Those pairs found
significantly different (p<.05) are indicated by different group numbers. Critical difference
equals 0.1335.

Experimental  Listening               Transformed
Condition     Condition     Mean      Mean          Grouping
T1D1          -HQ           98.30     1.449         1
              +HQ           97.15     1.433         1
              -HN           89.32     1.244         2
              +HN           87.13     1.206         2
T1D2          -HQ           95.80     1.383         1
              +HQ           95.48     1.364         1
              -HN           89.95     1.256         2
              +HN           86.94     1.208         2
T1D3          -HQ           95.98     1.385         1
              +HQ           89.43     1.243         2
              -HN           89.11     1.241         2
              +HN           75.89     1.063         3
T2D1          -HQ           94.44     1.343         1
              +HQ           89.12     1.242         2
              -HN           84.13     1.171         2 3
              +HN           78.60     1.096         3
T2D2          -HQ           97.66     1.429         1
              +HQ           88.19     1.240         2
              -HN           86.38     1.215         2 3
              +HN           83.39     1.155         3
T2D3          -HQ           95.82     1.387         1
              +HQ           87.87     1.222         2
              -HN           87.14     1.207         2
              +HN           77.66     1.082         3

Figure 10. Group means of IMCT nonsense syllables for each HELOS and noise
condition re: threshold setting. The “T-Score” is the transformed mean score derived
from an arcsine square root procedure. (y-axis: T-score.)

Table 12. Connected speech passages (CST): Pairwise Tukey WSD comparison of
individual cell means within each TD experimental setting by HELOS and noise subgroup.
Those pairs found significantly (p<.05) different are indicated by different group numbers.
Critical difference equals 0.1485.

Listening    Experimental             Transformed
Condition    Condition       Mean     Mean          Grouping
-HQ          T1D1            99.6     1.530         1
             T1D2            99.1     1.504         1 2
             T1D3            99.7     1.540         1
             T2D1            96.2     1.419         2
             T2D2            99.3     1.512         1 2
             T2D3            99.5     1.526         1
-HN          T1D1            87.6     1.240         1 2
             T1D2            87.7     1.221         1 2
             T1D3            90.5     1.267

Figure 11. Group means of CST passages for each TD experimental setting re: noise
condition. The “T-Score” is the transformed mean score derived from an arcsine square
root procedure.

Table 13. Connected speech passages (CST): Comparison of cell means collapsed across
threshold conditions through HELOS by noise subgroups. Those pairs found significantly
(p<.05) different are indicated by different group numbers. Critical difference equals
0.0425.

Figure 12a. HELOS by noise conditions for CST passages. The “T-Score” is the
transformed mean score derived from an arcsine square root procedure.

Figure 12b. HELOS by distortion conditions for CST passages. The “T-Score” is the
transformed mean score derived from an arcsine square root procedure.

Table 14. Connected speech passages (CST): Pairwise Tukey WSD comparison of
individual cell means within each experimental TD setting by noise subgroup. Those
pairs found significantly (p<.05) different are indicated by different group numbers.
Critical difference equals 0.1335.

Table 15. Connected speech passages (CST): Comparison of cell means collapsed
across threshold conditions through HELOS by distortion settings. Those pairs found
significantly (p<.05) different are indicated by different group numbers. Critical
difference is equal to 0.0862.

                               Transformed
HELOS    Distortion    Mean    Mean          Grouping
-H       D1            91.8    1.339         1
         D2            93.3    1.361         1
         D3            94.1    1.387         1
+H       D1            88.1    1.270         1
         D2            88.0    1.270         1
         D3            82.8    1.204         1


Table 16. Pearson product-moment correlations between speech materials across
listening conditions. Asterisks (*) indicate those correlations which were significant
with a p<.05.


CHAPTER 5
DISCUSSION


The  historical  foundations  of  audiology  are  found  in  the  area  of  audiologic 
management  or,  as  it  has  been  known  traditionally,  aural  rehabilitation.  The  basic  goal  of 
today’s  audiologists  and  related  professionals  is  still  to  help  individuals  with  hearing 
impairment  and  their  family  and  associates  to  communicate  more  effectively  with  each 
other  and  the  rest  of  the  world.  Even  though  audiologists  and  related  professionals 
recognize  a variety  of  management  procedures,  the  ultimate  goal  is  determining  the  critical 
factors  that  are  involved  in  and  that  impair  the  accuracy  of  speech  perception.  If  these 
factors  could  be  identified  and  compared  to  their  presence,  absence  or  distortion  in  the 
impaired  ear,  the  resulting  information  could  be  applied  to  beneficial  treatment  of  hearing 
impairment.  This  information  would  impact  technology  used  to  restore  an  accurate 
speech  signal  to  the  impaired  ear,  such  as  in  hearing  aids,  assistive  listening  devices  and 
auditory  implants.  It  would  also  impact  management  and  training  procedures,  such  as  in 
auditory  and  speechreading  training,  educational  methods,  and  use  of  vocational  skills. 
Reliable  and  valid  procedures  to  measure  the  presence  or  absence  of  these  critical  factors 
are  the  bridge  between  information  regarding  speech  perception  and  its  application  to 
treatment. Coupled with the economic demands of health care, this need creates a necessity for
efficient and thorough assessment of the speech perception abilities of a person with
hearing impairment. Clearly, basic and applied science have an interacting relationship
that is vital for the growth of scientific knowledge and effective audiologic management.

The general purpose of this study was to investigate the effects that acoustic
distortions have on accurate speech perception using a hearing loss simulation procedure
in listeners with normal hearing. The majority of previous studies have investigated
isolated acoustic parameters and their effect on recognition of syllabic, word and sentential
material. Although one of the most common complaints of listeners with hearing
impairment is difficulty understanding conversational speech in noisy environments, most simulation
studies have not presented stimuli in background noise. Specifically, this study
investigated the effects of combined acoustic distortions on the recognition of connected
speech in quiet and noise. Therefore, data were collected on recognition of nonsense
syllables, words and passages presented in quiet and noise under hearing loss simulation
conditions that manipulated threshold level and other types of distortion.

Effects  of  Combined  Acoustic  Distortions 

The  findings  of  this  study  can  be  summarized  according  to  the  acoustic  parameters 
investigated:  threshold,  other  types  of  spectrum  distortion  and  environmental  noise. 
Considering  the  dynamic  variability  of  speech  signals  and  the  multiple  factors  involved  in 
accurate  perception  by  a listener,  it  was  not  surprising  to  find  that  different  acoustic 
parameters  played  dominant  roles  in  different  listening  conditions. 

Threshold  attenuation  was  the  dominant  factor  in  the  recognition  of  monosyllabic 
words  and  nonsense  syllables  when  presented  in  quiet.  Increased  distortion  significantly 
decreased  recognition  when  stimuli  were  presented  without  threshold  attenuation,  but 
surprisingly,  not  in  combination  with  threshold  attenuation.  The  opposite  was  true  when 
stimuli  were  presented  in  noise.  Significant  differences  were  found  only  when  listeners 
heard words and syllables under combined threshold and distortion conditions, not under
threshold or distortion conditions alone.

These  findings  suggest  that  when  individuals  with  mild  to  moderate  hearing 
impairment  listen  in  quiet  surroundings,  the  dominant  factor  in  understanding  a speaker  is 
threshold  sensitivity,  even  if  frequency,  loudness,  and  temporal  distortions  are  present  in 
the  listener’s  ear.  However,  when  listening  in  noisy  environments,  the  noise  creates  an 
auditory  environment  where  the  effects  of  threshold  attenuation  and  distortions  combine. 
Thus,  speech  understanding  is  reduced  even  more  than  can  be  accounted  for  by  threshold 
or  distortion  changes  alone.  Finitzo-Hieber  & Tillman  (1978)  and  Nabelek  & Mason 
(1981)  have  reported  data  that  demonstrate  that  the  combined  effects  of  distortions  of  the 
signal in the sound field, such as noise and reverberation time, are greater than the sum of
both  effects  measured  separately.  Lacroix,  Harris  & Randolph  (1979)  also  found  this 
result  to  be  true  when  they  varied  combinations  of  low-pass  filtering  with  time 
compression,  speech  interruptions,  and  masking. 

This  situation  is  observed  clinically  when  individuals  with  mild  to  moderate  hearing 
impairment  can  function  with  little  difficulty  in  a quiet  listening  environment  (as 
demonstrated  during  interviewing  and  speech  recognition  testing  typically  presented  in 
quiet).  However,  they  report  difficulty  understanding  speech  in  groups  and  noisy  listening 
environments.  Certainly,  the  neurophysiologic  thresholds  of  most  people  do  not  change 
from one environment to the other. Other internal and external acoustic factors must be
involved.

Results of this study would help explain the variability seen in speech recognition,
particularly in noise, by individuals who have similar audiometric thresholds and
configuration. Although many of these individuals report benefit from hearing aid
amplification, particularly in quiet, they commonly complain that speech understanding in
noise is not improved and is often worsened (Smedley & Schow, 1990). Unfortunately,
improved speech recognition in these listening environments is their greatest need and
desire.

The  findings  for  monosyllabic  words  and  nonsense  syllables  in  quiet  are  consistent 
with  studies  reported  by  Fabry  and  Van  Tassel  (1986),  Humes,  et  al.  (1987),  and  Wang, 
Reed  and  Bilger  (1978).  These  authors  concluded  from  their  data  obtained  in  quiet  that 
speech  recognition  was  significantly  affected  by  threshold  attenuation  as  measured  by 
pure  tone  sensitivity.  However,  this  investigation’s  data  in  noise  do  not  agree  with  results 
reported by Zurek and Delhorne (1987) and Needleman and Crandell (1995), who
presented  nonsense  syllables  in  noise  to  simulation  listeners  and  listeners  with  hearing 
impairment.  Both  studies  found  no  significant  differences  between  subjects  for  nonsense 
syllable  recognition.  The  authors  concluded  that  threshold  attenuation  or  audibility  of  the 
signal  was  the  dominant  factor  in  accurate  recognition  of  nonsense  syllables.  One  of  the 
strengths  of  these  two  studies  is  that  methodology  controlled  for  audibility  of  the  signal. 

Other  researchers  have  reported  data  indicating  that  threshold  attenuation  alone 
does  not  account  for  significant  differences  in  speech  recognition  (Dirks  & Ellison,  1989; 
Gagne  & Erber,  1987;  Turner  & Robb,  1987;  Villchur,  1977;  and  Walden,  et  al.,  1981). 
Instead,  other  types  of  distortion  contributed  by  the  hearing  impairment  had  a greater 
impact than threshold, or, when in combination with threshold attenuation, sometimes
demonstrated a multiplicative effect under noisy listening conditions.

The  role  of  threshold  attenuation  and  other  distortions  is  even  more  complex  in  the 
recognition of connected speech. The results of this study showed that threshold had a
significant effect regardless of listening condition or experimental variable. Distortion also
significantly influenced recognition scores under hearing loss simulation conditions,
although statistical procedures could not specify the influence. Group performance means
for the CST showed a clear pattern of decreasing recognition as variables were combined.
In other words, recognition was always poorer in noise than in quiet but became worse when
the HELOS was on with combined threshold and distortion settings.

These  findings  are  consistent  with  data  on  sentence  recognition  reported  by  Baer 
and  Moore  (1993),  Gagne  and  Erber  (1987),  Moore  and  Glasberg  (1993),  Moore, 
Glasberg  and  Vickers  (1995),  Needleman  and  Crandell  (1995),  Plomp  (1978,  1986)  and 
Villchur  (1974,  1977).  The  majority  of  these  studies  used  methods  to  control  for 
audibility  of  the  signal  as  it  was  presented  under  simulated  frequency  distortion,  temporal 
changes  and  various  noise  levels.  All  reported  that  key  word  and  whole  sentence 
recognition  were  significantly  affected  by  suprathreshold  acoustic  distortions  and 
background  noise. 

There  is  a paucity  of  literature  regarding  recognition  of  connected  speech  passages 
by listeners with true or simulated hearing loss. One of the limitations of research
previously reported is the assumed relationship of nonsense syllable,
monosyllabic word and isolated sentence recognition to everyday conversational speech. This lack
of information is due to the complexity of connected speech and the extreme difficulty, if
not impossibility, of experimental control. This complexity arises because
connected speech is no longer only an auditory event. Boothroyd (1993), Moore (1989)
and Nerbonne and Schow (1989) provide explanations of the multiple variables involved
in communication, ranging from acoustic characteristics of the signal and the environment,
to the linguistic context and to the physical and mental abilities of the speaker and listener.
Connected speech is characterized by redundancy as a result of these multiple variables,
which involve acoustic and linguistic cues that provide the listener with speech
information. In addition, listeners use “world knowledge” to understand contextual
speech.  For  example,  if  a topic  is  known  and  the  listener  is  familiar  with  vocabulary  and 
meanings  related  to  that  topic,  the  listener  is  better  able  to  understand  an  unclear  speech 
signal. 

The  powerful  impact  of  listener  knowledge  set  or  readiness  is  evident  even  with 
minimal  cueing.  For  example,  the  transitional  cues  and  knowledge  of  phonetic  rules  are 
used  to  predict  a distorted  or  inaudible  speech  phoneme  (Boothroyd,  1993).  This 
readiness advantage has also been found in higher performance in recognizing speech
stimuli from a closed versus an open set (Holmes, Kricos & Kessler, 1988). Although the
recognition scores of this study were not tested for significant differences across speech
stimuli, group means were generally higher for the closed set of nonsense
syllables than for the open set of monosyllabic words. Recognition scores were
highest for CST passages, which are a closed set (topic known) and very redundant in
acoustic and semantic information. Individuals with normal hearing depend on these
redundant cues to understand a degraded speech signal in noise. Speech is even less
redundant  for  listeners  with  hearing  impairment  because  the  physical  loss  of  auditory 
function  results  in  many  acoustic  cues  being  lost  or  obscured.  Thus,  they  typically  have 
more  difficulty  understanding  speech  in  noisy  environments  than  listeners  with  normal 
hearing  (Boothroyd,  1993;  Nabelek,  1993).  Regardless  of  these  complex  interacting 
variables, the study of connected speech recognition and its reliable assessment are
necessary to provide and evaluate effective treatment for individuals with hearing
impairment and their families.

In  summary,  the  speech  recognition  results  of  this  study  indicated  that  threshold 
attenuation  is  not  the  major  component  in  inaccurate  speech  recognition  under  all  listening 
conditions  for  all  types  of  speech  material.  Neither  are  other  distortions,  such  as 
decreased  frequency  selectivity,  loudness  recruitment,  and  abnormal  temporal  processing. 
The  role  that  each  type  of  acoustic  distortion  plays  changes  as  it  interacts  with  the  multiple 
variables  involved  in  a dynamic  speech  signal.  However,  these  results  provide  further 
evidence  that  recognition  of  all  types  of  speech  materials,  including  connected  speech,  is 
affected  more  when  threshold  and  other  distortions  are  active  in  combination,  particularly 
when  listening  in  noise. 

Comparison of Speech Recognition and HELOS Output

The  ANOVA  results  of  this  study  must  also  be  considered  in  relation  to  the 
acoustic  parameters  of  the  signal  received  by  the  subject.  According  to  simulation  pure 
tone  thresholds  (Figure  3),  the  simulated  hearing  loss  created  by  the  HELOS  was  not 
“flat” or equally attenuated across audiometric frequencies. The U-shape configuration
across HELOS experimental settings indicates a moderate hearing loss at 1000 Hz rising
to a mild hearing loss at 8000 Hz or better. It is well established that perception of high
frequencies is critical for speech recognition of phonemes and monosyllables (Fletcher,
1953; French & Steinberg, 1947; Miller & Nicely, 1955). Studies of speech recognition
and  development  in  children  with  hearing  impairment  indicate  the  necessity  of  hearing  the 
second  formant  of  vowels  and  consonant  energy  found  above  2000  Hz  to  perceive  place 
of  articulation  cues.  Based  on  these  facts  and  the  rising  simulation  pure  tone  thresholds, 
it  was  not  surprising  that  recognition  scores  of  the  IMCT  nonsense  syllables  were  better 
than  NU-6  monosyllabic  word  scores.  In  addition,  subjects  had  the  advantage  of  a closed 
set  with  the  IMCT  syllables.  However,  the  performance-intensity  function  for  nonsense 
syllables  indicates  that  it  is  the  most  difficult  recognition  task,  due  to  the  minimal  number 
of  cues  available  (Penrod,  1994).  The  NU-6  words  not  only  include  more  inherent 
acoustic  cues  from  phonemic  transitions  and  phonetic  rules,  but  the  words  were  also 
introduced  with  a carrier  phrase.  In  addition,  the  same  threshold  attenuation,  bandwidth 
(BW)  and  temporal  distortions  were  being  imposed  on  the  NU-6  words  as  on  the  IMCT 
syllables. 

Perhaps  the  answer  is  in  these  very  arguments.  Masking  studies  investigating  the 
effects  of  one  signal  masking  another  signal  have  shown  that  masking  patterns  (using 
excitation  or  auditory  filter  plots)  have  shallower  and  broader  high  frequency  slopes  or 
tails (Egan & Hake, 1950). The amount of masking increases in a nonlinear fashion on the
high frequency slope. For example, when the intensity of a low frequency masker is
increased by X dB, the masked threshold of a high frequency signal is elevated by more
than X dB. This psychoacoustic phenomenon has been called the “upward spread of
masking” (Moore, 1989). Studies investigating other types of masking, such as forward
and backward masking, indicate that the intensity of one signal can affect the threshold of
the signal immediately preceding or succeeding it (Duifhuis, 1973; Kidd & Feth, 1982).
Perhaps the longer and acoustically more complex carrier phrase for the NU-6 words was
providing masking effects in the upper frequencies. The greatest energy of the connected
speech signal is found around 500 Hz, which is related to the first vocal tract formant or
resonance related to the fundamental frequency of the speaker’s voice. The RMS values
for  pure  tone  input  signals  for  each  HELOS  setting  reflect  the  behavioral  pure  tone 
thresholds  with  the  greatest  voltage  measured  at  250  Hz.  The  simulated  pure  tone 
thresholds  for  250  and  500  Hz  could  provide  upward  spread  of  masking  to  the  high 
frequencies.  This  information  would  suggest  that  the  simulated  pure  tone  thresholds  were 
misleading  and  the  speech  spectrum  output  did  create  a perceived  “flat”  hearing  loss. 

The  monotonic  relationship  between  frequency  bandwidths,  number  of  periodicity 
changes  and  increased  HELOS  settings  was  reflected  by  the  decreasing  recognition  scores 
as  combined  threshold  and  distortion  listening  conditions  increased,  particularly  in 
background  noise.  Analysis  of  consonant  errors  made  for  the  NU-6  words  and  IMCT 
syllables  would  allow  further  comparisons  between  the  stimuli  and  the  electroacoustic 
output  of  the  HELOS.  This  procedure  is  the  next  stage  of  analyzing  the  data  obtained 
during this investigation.

Correlations of Speech Materials

Another experimental question of this study was whether there was any
significant correlation among the speech materials used. People seek evaluation of their
hearing  because  of  difficulties  understanding  conversational  speech.  According  to  clinical 
practice  surveys,  99%  of  the  responding  audiologists  administered  word  recognition  tests 
as  the  primary  measure  of  speech  perception  ability.  Of  this  number,  75%  reported  using 
a single  presentation  level  in  quiet  related  to  a spondee  speech  reception  threshold  and 
60%  reported  always  using  25  words  (Martin,  Armstrong,  & Champlin,  1994).  The 
validity  of  these  procedures  is  highly  questionable  in  terms  of  assessing  the  stated  concerns 
of  most  individuals  seeking  hearing  help.  However,  economic  constraints  related  to  time 
efficiency, operational costs, etc., may support the current clinical practices. Obviously,
one solution is to change procedures. There is now considerable evidence that presenting
speech tests at higher presentation levels and in noise can efficiently provide diagnostic
medical information and is more reflective of a person’s speech perception abilities
(Mendel & Danhauer, 1997). In light of these data, it is difficult to understand the
continuing  practices  of  the  majority  of  practicing  audiologists.  Perhaps  it  is  a function  of 
habit  and  familiarity,  being  confident  in  procedures  learned  during  initial  professional 
training. 

Therefore,  another  solution  would  be  to  determine  if  any  current  speech 
recognition  test  could  be  related  to  a person’s  ability  to  understand  conversational  speech. 
The results of this study suggest that there is. The significant correlations between
recognition of NU-6 monosyllabic words and the CST passages when presented in noise
indicate that predictive score values could be developed. This line of research needs to be
done  with  listeners  with  hearing  impairment.  Because  understanding  conversational 
speech  is  so  dependent  on  situational  variables,  self-assessment  of  perceived  listening 
difficulty  and  overall  handicap  could  be  used  to  help  validate  these  predictive  values.  This 
evaluation  tool  would  be  very  useful  in  diagnostic  and  management  procedures. 
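
As a rough illustration of how such predictive values might be derived, the sketch below computes a Pearson correlation and a least-squares line relating word scores to passage scores. The score lists are hypothetical placeholders, not data from this study, and the function names are illustrative only. A simple linear fit is itself an assumption; the form of the relationship would have to be established with listeners with hearing impairment.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary in both lists")
    return sxy / sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("predictor scores must vary")
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    return slope, mean_y - slope * mean_x


# HYPOTHETICAL percent-correct scores in noise (not taken from this study).
nu6_in_noise = [40.0, 52.0, 60.0, 68.0, 80.0]
cst_in_noise = [70.0, 78.0, 80.0, 88.0, 94.0]

r = pearson_r(nu6_in_noise, cst_in_noise)
slope, intercept = fit_line(nu6_in_noise, cst_in_noise)
predicted_cst = slope * 64.0 + intercept  # predicted CST score for an NU-6 score of 64%
print(f"r = {r:.2f}, predicted CST = {predicted_cst:.1f}%")
```

Any predictive value obtained this way would still need validation against self-assessment measures, as noted above.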

Implications  for  Future  Research 

Regardless  of  the  level  of  experimental  control,  every  research  study  has  its 
limitations.  The  positive,  and  often  exciting,  outcome  of  these  limitations  is  the  generation 
of  more  questions  and  ideas  related  to  modifying  the  study  and/or  to  a new  investigation. 
The  results  of  this  study  and  their  meaning  need  to  be  examined  according  to  the 
methods  used.  There  are  several  areas  that  yield  future  avenues  of  research. 

Simulation  of  hearing  loss 

This  study  used  the  method  of  simulating  hearing  impairment  in  listeners  with 
normal  hearing  to  investigate  speech  recognition  of  a distorted  auditory  signal.  This 
method  is  based  on  the  underlying  assumption  that  the  characteristics  of  sensorineural 
hearing impairment can be duplicated with manipulation of acoustic parameters, e.g.,
frequency  filtering,  noise  masking,  time  alternation,  abnormal  loudness  functions,  etc. 
Simulation  of  hearing  impairment  has  been  reported  or  verified  by  “matching”  the 
behavioral  responses  of  simulation  listeners  with  normal  hearing  to  those  responses  of 
listeners with hearing impairment. The matching criterion most typically used is pure tone
thresholds. Yet, studies reviewed have shown that pure tone thresholds and speech
recognition test scores are not highly correlated for the population with hearing
impairment. Because simulation data are equivocal regarding similarities between
simulation listeners and listeners with hearing impairment, the validity of this assumption
must be considered.

Experimental  procedures 

The  procedures  used  in  this  study  could  be  modified  for  more  exact  control  and 
measurement  of  subject  responses.  These  modifications  could  include  using  an  adaptive 
procedure to establish thresholds and determine an SRT-in-noise ratio. This modification
would not only provide further control of subject and experimental variables, but also allow for
efficient presentation of stimuli in multiple levels and types of noise.
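
One common adaptive procedure is a simple one-down, one-up staircase, which converges on the signal-to-noise ratio yielding about 50% correct. The sketch below is a minimal illustration under stated assumptions: the starting ratio, step size, number of reversals, and the `respond` callback (which stands in for presenting a stimulus and scoring the response) are all hypothetical choices, not the procedure of this study.

```python
def staircase_srt(respond, start_snr_db=8.0, step_db=2.0, n_reversals=6, max_trials=200):
    """Estimate an SRT-in-noise ratio with a one-down, one-up staircase.

    respond(snr_db) must return True for a correct response at that ratio.
    The estimate is the mean ratio at the recorded reversal points.
    """
    snr = start_snr_db
    last_direction = 0  # 0 until the first trial has been scored
    reversals = []
    for _ in range(max_trials):
        correct = respond(snr)
        # Make the task harder after a correct response, easier after an error.
        direction = -1 if correct else 1
        if last_direction != 0 and direction != last_direction:
            reversals.append(snr)  # the track changed direction at this ratio
            if len(reversals) == n_reversals:
                break
        last_direction = direction
        snr += direction * step_db
    if not reversals:
        raise ValueError("no reversals recorded; widen the range or add trials")
    return sum(reversals) / len(reversals)


# Hypothetical deterministic listener: correct whenever the ratio is -3 dB or better.
def simulated_listener(snr_db):
    return snr_db >= -3.0


print(staircase_srt(simulated_listener))
```

A real listener responds probabilistically, so in practice more reversals and a smaller final step would probably be needed for a stable estimate.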

The use of the IMCT nonsense syllables was a possible limitation. Although the IMCT has
been used extensively in evaluation of persons with cochlear implants, it does not have the
psychometric research data that some other available nonsense syllable tests have. Also, the lists used
were limited to one vowel context, /i/. Other tools, such as the Nonsense Syllable Test
(NST; Edgerton & Danhauer, 1979) and the Distinctive Feature Difference Test (DFD;
Feeney & Franks, 1982), assess perception of
consonants in a variety of vowel contexts.

Further investigation of the Connected Speech Test (CST) needs to be done. The
small range of recognition scores across experimental conditions may indicate that it was
too easy for the subjects, regardless of its strong psychometric development and
standardization data. It would be interesting to test for recognition differences when the
topic of the passage was and was not provided to the listener. The CST may prove more
useful in speech recognition assessment in the auditory-visual mode. This information
would  be  very  useful  in  training  procedures. 

HELOS

The purpose of the HELOS is to provide simulation of multiple degrees and
characteristics of hearing loss. The primary information gathered during this study is that
the exact output of the HELOS is unknown. Given this, and the reluctance of the designer
to provide electronic specifications, this author does not recommend that the HELOS be
used in hearing research. The fundamental principle of its design, combining a
variety of acoustic distortions, is appropriate and useful in hearing loss simulation for
clinical treatment. However, the lack of specific information regarding the relationship
between the control settings and audiometric data unfortunately limits its usefulness. In
addition, recent advances in digital technology provide a much greater ability to generate
multiple signals within very narrow parameter specifications. Previously, researchers using analog
devices to study perception of distorted signals had to account for the distortion
inherent in the equipment used. Digital technology allows for signals to be generated and
presented with negligible influence from the electronic device. Therefore, this author will
not conduct further investigations within this area of speech perception using the HELOS.
However,  the  device  may  be  a useful  tool  in  treatment  efficacy  research  related  to  the 
perception  of  hearing  handicap  and  communication  fluency. 


APPENDIX  A 

IOWA  MEDIAL  CONSONANT  TEST  (IMCT) 


Consonant    Phonetic Symbol    Pronunciation

b                               bee
ch           tʃ                 chee
d                               dee
f                               fee
g                               go (hard)
h                               hee
j                               jee
k                               kee
l                               lee
m                               mee
n                               nee
ng           ŋ                  ring
p                               pee
r                               ree
s                               see
sh           ʃ                  she
t                               tee
th           ð                  thee (hard/voiced)
th           θ                  thin (soft/unvoiced)
v                               vee
w                               wee
y            j                  yee
z                               zee
z            ʒ                  measure (soft)


APPENDIX B

EXAMPLES OF CONNECTED SPEECH TEST PASSAGES
(COX, ALEXANDER & GILMORE, 1987)


#1. WINDOW

Windows  PROVIDE  LIGHT  and  air  to  ROOMS. 

Windows were ONCE COVERED with CRUDE SHUTTERS.

Later,  oiled  PAPER  was  USED  for  windowpanes. 

Glass  windows  first  appeared  in  ancient  Rome. 

Colored  glass  was  used  in  European  windows. 

Some churches were famous for their beautiful windows.

These windows DISPLAYED PICTURES from the BIBLE.

Pieces  of  glass  were  held  together  by  lead. 

Such  windows  may  be  seen  in  French  cathedrals. 

English  churches  also  contain  STAINED  glass  windows. 

#3.  UMBRELLA 

The  NAME  "umbrella"  means  small  shadow. 

Umbrellas  WERE  first  used  in  ancient  Egypt. 

They  gave  protection  from  the  fierce  sunshine. 

Slaves  held  umbrellas  over  their  masters. 

In  Egypt  today,  many  people  CARRY  umbrellas. 

In  EARLY  Rome,  ONLY  WOMEN  used  umbrellas. 

If  a MAN  did,  he  was  considered  a sissy. 

Umbrellas  were  USED  by  both  sexes  in  England. 

Today, people use umbrellas to keep out the rain.

Umbrellas USED as sunshades are called parasols.

#4.  GIRAFFE 

The giraffe is the tallest wild ANIMAL.

It is three times taller than a man.

A full  grown  giraffe  is  eighteen  FEET  high. 

The  giraffe  has  an  extremely  LONG  NECK. 

The  neck  HAS  ONLY  seven  NECKBONES. 

The  GIRAFFE'S  BODY  is  about  the  SIZE  of  a HORSE'S. 

The  BODY  is  SHAPED  LIKE  a triangle. 

Africa is the only COUNTRY WHERE giraffes LIVE wild.

Large  groups  of  them  are  found  on  the  plains. 

They live there with LIONS and ELEPHANTS.


APPENDIX C

RELIABILITY OF TRANSCRIBERS


Transcriber    Checker    # Items      # Items         % Reliability
                          Agreement    Disagreement

CB             JT         96           0               100.0
               JT         94           2                97.9
               AK         93           3                96.8
               AK         95           1                99.0
JT             CB         93           3                96.9
               CB         94           2                97.9
               AK         96           0               100.0
AK             CB         95           1                99.0
               CB         94           2                97.9
               JK         92           0               100.0

Mean                                                    98.5
Mean for CB                                             98.4

Note. Reliability was determined by point-by-point agreement. Percent reliability is the number of items in
agreement divided by the sum of items in agreement and disagreement. CB transcribed
for 78% (47) of the participants.
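
The reliability calculation described in the note can be sketched as follows. This is a minimal illustration only: the function name is hypothetical, the multiplication by 100 is assumed from the percentage values reported in the table, and the three example rows are copied from the table.

```python
def percent_reliability(agreements, disagreements):
    """Point-by-point agreement: agreements / (agreements + disagreements), as a percentage."""
    total = agreements + disagreements
    if total == 0:
        raise ValueError("at least one scored item is required")
    return 100.0 * agreements / total


# (items in agreement, items in disagreement) for three rows of the table.
example_rows = [(96, 0), (94, 2), (95, 1)]
for agree, disagree in example_rows:
    print(f"{agree:>3} {disagree:>2} {percent_reliability(agree, disagree):6.1f}")
```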


APPENDIX D

PURE TONE AND SPEECH THRESHOLDS
RAW DATA